ADMAS UNIVERSITY
POSTGRADUATE SCHOOL
MSC PROGRAM

BEAN CROP DISEASE DETECTION USING A MACHINE LEARNING ALGORITHM

A Thesis Submitted to the Department of Computer Science for the Partial Fulfillment of the Requirements for the Degree of Master of Science in Computer Science

By AGMAS GETNET

AUGUST 2020
ADDIS ABABA, ETHIOPIA

Declaration

I, Agmas Getnet, the undersigned, declare that this thesis entitled “BEAN CROP DISEASE DETECTION USING A MACHINE LEARNING ALGORITHM” is my original work. I have undertaken the research work independently with the guidance and support of the research advisor. This study has not been submitted for any degree or diploma program in this or any other institution, and all sources of materials used for the thesis have been duly acknowledged.

Declared by:
Name: ____________________________
Signature: ________________________
Department: ______________________
Date: _____________________________

Certificate of Approval of Thesis

School of Postgraduate Studies
Admas University

This is to certify that the thesis prepared by Agmas Getnet, entitled “BEAN CROP DISEASE DETECTION USING A MACHINE LEARNING ALGORITHM” and submitted in partial fulfilment of the requirements for the Degree of Master of Science in Computer Science, complies with the regulations of the University and meets the accepted standards with respect to originality and quality.

Name of Candidate: ___________________ Signature: _____________ Date: _____________
Name of Advisor: _____________________ Signature: _____________ Date: _____________

Signature of Board of Examiners:
External examiner: ____________________ Signature: ____________ Date: ____________
Internal examiner: ____________________ Signature: ____________ Date: ____________
Dean, SGS: __________________________ Signature: ____________ Date: ____________

ABSTRACT

Bean is one of the most widely grown crops in the world.
This crop is prone to various diseases such as rust, bacterial blight, angular leaf spot, Alternaria leaf spot, web blight and root rots. Among these, bacterial blight, caused by Xanthomonas axonopodis pv. phaseoli, is the most dangerous and widely occurring disease. Nowadays farmers and agricultural experts identify the symptoms of these diseases by eye, but they cannot differentiate the types of disease at the earliest stage of development. To identify the disease type, farmers need guidance from an expert, which helps them identify the disease correctly and reduces wasted time and cost. To limit the damage, a disease should be detected at its primary stage of development so that pesticide can be sprayed on the diseased plants. If the disease progresses beyond its earliest stage of development, it cannot be controlled easily. To solve these problems, a novel automated computer-based system is proposed for the classification and early detection of diseases on bean crop, using image processing with an image segmentation algorithm, K-means clustering, and a machine learning algorithm, the SVM classifier. To identify bean crop diseases, we followed the steps of image acquisition, image preprocessing, segmentation, feature extraction and classification. First, we collected images as input. Second, in the preprocessing step, we removed noise and resized the images. In the third step, segmentation was performed by dividing the images into clusters (1, 2, 3). Fourth, features of the images, such as texture, were extracted. Lastly, the images were classified into the disease group to which they belong using the support vector machine classifier, with an accuracy of 96.77%.

Keywords: Bean, Bacterial Blight, Image processing, K-means clustering, SVM.

ACKNOWLEDGEMENT

First and foremost, thanks to God, the Almighty, for his blessings throughout my research work.
I would like to express my gratitude to my advisor, Dr. Henok Mulugeta, for his invaluable guidance, sincerity and motivation in accomplishing the research. I am highly indebted to Admas University Postgraduate School for their guidance and constant supervision, as well as for providing the necessary information regarding this research, and I thank the Ethiopian Institute of Agricultural Research, Debre Zeyt branch, for their information. I would like to thank my parents. Thank you, my mother, Mastewal Wondmnew, for urging me to start the master’s program and for your unremitting motivation, and thanks to my sister Yealemmebrat Getnet. I would like to thank my friends Addis Tsega, Addisu Gizachew, Meron Kassa and Shewakena Getnet for their support, inspiration, stimulating discussion and impetus.

TABLE OF CONTENTS

ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
ABBREVIATIONS
CHAPTER ONE
INTRODUCTION
1.1. Background
1.2. Statement of problem
1.3. Objectives
1.3.1. General objective
1.3.2. Specific objectives
1.4. Scope and limitation of the research
1.4.1. Scope of the research
1.4.2. Limitation of the research
1.5. Significance of the study
1.6. Research organization
CHAPTER TWO
LITERATURE REVIEW
2.1. Introduction
2.2. Bean Diseases
2.3. Digital Image Processing
2.3.1. Image processing methods
2.3.2. Fundamental Steps of Digital Image Processing
2.3.3. Types of image processing
2.4. Related sources
CHAPTER THREE
RESEARCH METHODOLOGY
3.1. Experimentation Tools
3.2. Algorithm
3.3. Analysis and Design
3.4. Data Collection and Dataset Preparation
3.4.1. Data Collection
3.4.2. Dataset Preparation
3.5. Sampling techniques
3.6. Materials and methods
3.7. Evaluation Technique
CHAPTER FOUR
PROPOSED SYSTEM MODEL
4.1. System Architecture
4.2. Tasks of Image processing
4.2.1. Image Acquisition
4.2.2. Image Preprocessing
4.2.3. Image segmentation
4.2.4. Feature extraction
4.2.5. Classification
CHAPTER FIVE
RESULTS AND DISCUSSIONS
5.1. Introduction
5.2. Data Set
5.3. Testing Techniques on MATLAB
5.4. Implementation
5.4.1. Stage One: Image acquisition
5.4.2. Stage Two: Image Preprocessing
5.4.3. Stage Three: Image segmentation
5.4.4. Stage Four: Feature extraction
5.4.5. Stage Five: Classification
CHAPTER SIX
CONCLUSION AND RECOMMENDATION
6.1. Conclusion
6.2. Recommendation
References

LIST OF FIGURES

Figure 2-1: Phases of plant disease detection system
Figure 2-2: Block diagram for Image Processing
Figure 2-3: Framework of image processing operation
Figure 2-4: Low level Image processing
Figure 2-5: Middle level image processing
Figure 4-1: Proposed system architecture
Figure 4-2: Flow chart to classify images
Figure 5-1: Conversion of RGB2HSI
Figure 5-2: Preprocessing images
Figure 5-3: Conversion of image to R, G and B and Histogram of the R, G and B
Figure 5-4: Histogram equalization
Figure 5-5: Conversion of RGB to L*a*b color
Figure 5-6: K-mean clustering
Figure 5-7: Bean Crop disease detection GUI
Figure 5-8: Accuracy and Error rate detection of diseases

LIST OF TABLES

Table 4-1: Summary of different segmentation techniques
Table 4-2: Summary of different color techniques
Table 4-3: Summary of different texture feature extraction techniques
Table 4-4: Summary of different classifiers
Table 5-1: Accuracy value for each disease detection (%)

ABBREVIATIONS

SVM       SUPPORT VECTOR MACHINE
LAC       LATIN AMERICA AND THE CARIBBEAN
BCMV      BEAN COMMON MOSAIC VIRUS
BGMV      BEAN GOLDEN MOSAIC VIRUS
CBB       COMMON BACTERIAL BLIGHT
KG HA-1   KILOGRAM PER HECTARE
GSM       GLOBAL SYSTEM FOR MOBILE
GPRS      GENERAL PACKET RADIO SERVICES
KNN       K-NEAREST NEIGHBOUR
ANN       ARTIFICIAL NEURAL NETWORK
RGB       RED, GREEN, BLUE
HSI       HUE, SATURATION, INTENSITY
MRMR      MINIMUM REDUNDANCY MAXIMUM RELEVANCE
SGDM      SPATIAL GRAY-LEVEL DEPENDENCE MATRICES
FCM       FUZZY C-MEANS
IEEE      INSTITUTE OF ELECTRICAL AND ELECTRONICS ENGINEERS
PNN       PROBABILISTIC NEURAL NETWORK
RBF       RADIAL BASIS FUNCTION
EIAR      ETHIOPIAN INSTITUTE OF AGRICULTURAL RESEARCH
TP        TRUE POSITIVE
TN        TRUE NEGATIVE
FN        FALSE NEGATIVE
FP        FALSE POSITIVE
GLCM      GREY LEVEL CO-OCCURRENCE MATRIX
GDP       GROSS DOMESTIC PRODUCT
JPG       JOINT PHOTOGRAPHIC EXPERTS GROUP

CHAPTER ONE
INTRODUCTION

1.1. Background

Agriculture provides employment to a large part of the world's population. Its products are also a key need of every citizen. The study of agriculture, known as the agricultural sciences, has led to many developments in this field, such as scientific farming and the use of high-end technology in growing crops [1]. Bean is one of the most widely grown crops in the world. In 2010, global bean production was approximately 23,816,123 tons, with 24.4% and 17.7% of world production in LAC and Africa, respectively [2]. Bean is an important source of nutrients for about 500 million people in parts of Africa and Latin America, providing 65% of total protein consumed and 32% of energy [3]. Minerals and nutrients such as iron, phosphorus, magnesium, potassium, calcium, zinc and folate (a B vitamin) are found in beans and contribute to a balanced, healthy diet [4]. Faba bean is believed to have been introduced to Ethiopia in the 16th century by the Portuguese [1].
The economic significance of bean in Ethiopia is considerable, since it is one of the major food and cash crops. Small-scale farmers often grow it as a cash crop, and it is a major food legume in many parts of the country, where it is consumed in different types of traditional dishes [5]. Even though bean is susceptible to various diseases such as Alternaria leaf spot, bacterial blight, Cercospora yellow spot and red spider mite [6], it is still a source of nutrients for about 500 million people in parts of Africa and Latin America, providing 65% of total protein consumed and 32% of energy [5]. The yearly global bean production is approximately 12 million metric tons, with 5.5 and 2.5 million metric tons in Latin America and the Caribbean (LAC) and Africa, respectively [7].

Faba bean is grown throughout Ethiopia and is an increasingly important commodity in the cropping systems of smallholder producers (the average farm size for smallholder farmers is between 0.25 and 0.5 hectares) for food security and income. It also has health benefits because it is rich in protein (about 23% for dried shelled beans and about 6% for green beans) and is a good source of iron and zinc (both of which are key elements for mental development). According to the report of the Central Statistical Agency, in 2016 the area covered by bean production in Ethiopia was 113,249.95 ha for white bean and 244,049.94 ha for red bean, a total area of 357,299.89 ha, with total production of about 540,238.94 tons and a national average yield of 1600 kg/ha. Bean is mainly grown in the Eastern, Southern, South Western and Rift Valley areas of Ethiopia [5]. The production constraints reported in the literature for beans are poor agronomic practices, soil infertility, lack of improved cultivars, moisture stress, weed competition, and damage caused by pests and diseases [8][9].
Rust (Uromyces appendiculatus (Pers.) Unger), anthracnose (Colletotrichum lindemuthianum (Sacc.) Magnus), angular leaf spot (Phaeoisariopsis griseola (Sacc.) Ferr.), web blight (Rhizoctonia solani pv. phaseoli (Kuhn)), root rots (Fusarium solani pv. phaseoli (Mart.) Sacc.), bean common mosaic virus (BCMV) and bean golden mosaic virus (BGMV) are also major diseases that have been identified and cause considerable yield reduction in Ethiopia. Bacterial blight (BB) is a significant seed-borne disease of bean, caused by the gram-negative bacterial pathogen Xanthomonas axonopodis pv. phaseoli (Xap) and its fuscans variant Xanthomonas fuscans subsp. fuscans (Xff) [7]. Bacterial blight affects the foliage, pods and seeds of faba bean and is considered the major problem in most faba bean production areas of the world. During extended periods of warm and humid weather, the disease can be highly destructive and causes losses in both yield and seed quality of bean in many production areas of Ethiopia [5]. Bean bacterial blight is reported as one of the main obstacles to faba bean production throughout the country. However, its prevalence varies with growing area and season. For instance, in Hararghe, eastern Ethiopia, each percent increase in bacterial blight severity in broadcast and mixed intercropping caused seed yield losses of about 5.2 kilograms per hectare (kg ha-1) and 9.1 kg ha-1, respectively, at physiological maturity of the crop. At flowering, each percent increase in bacterial blight severity caused a yield reduction of 38.8 kg ha-1 in pure stand and 71.1 kg ha-1 in the row intercropping system in this area [7]. Therefore, because of the usefulness of the bean crop for the country, the severity of bacterial blight attacks, the difficulty of disease detection and the effort farmers spend on detecting bean diseases, we propose a solution to these problems. We detect this crop disease using image processing and a Support Vector Machine classifier.

1.2. Statement of problem

In Ethiopia the productivity level is declining because of diseases on crops [10]. Many crop diseases can reduce agricultural production, causing tremendous losses for farmers. It is difficult to detect these diseases with the human eye. Nowadays there is no better way to detect crop diseases than observation, or guessing the disease from symptoms that occurred previously in other crops or areas. Therefore, rather than losing money, energy and time trying to identify bean crop disease with the naked eye, a solution is to detect the disease type using machine learning and notify farmers of the result. Even if some diseases are visible to humans, it is not an easy task to detect and classify bean crop disease, as it requires continuous monitoring, which can be exhausting and expensive. Moreover, it is not good to wait until the symptoms are visible before taking action to treat them. It might be too late to act. The problems described above motivated us to use image processing techniques to resolve these issues. We undertook this research because of the bean crop disease factors stated above, and we believe that there must be a solution that detects the disease early with an automated system that identifies the disease level of the crop leaves. These obstacles also motivated us to investigate a method that could be used to detect bacterial blight, Alternaria leaf spot and halo blight disease of bean crop. We also add one additional class covering different bean crop diseases and detect healthy bean crops as healthy, unlike the work reported in the research paper “A Novel Approach to Classify and Detect Bean Diseases based on Image Processing” [11]. Since beans are vital to the existence of both humans and animals, farmers should be supplied with the best modern technologies.
These technologies should be capable of identifying and classifying a wide variety of bean diseases in a short time. Detecting bean diseases at early stages can reduce crop losses significantly [11]. Image processing is necessary in these cases. The algorithms used in image processing make it possible to detect bean crop diseases automatically.

Research questions

The research answers the question below.

How can a machine learning algorithm be used to classify bean crop disease?

1.3. Objectives

1.3.1. General objective

The main objective of this research is to increase healthy crop production in Ethiopia by detecting bean crop diseases using machine learning algorithms.

1.3.2. Specific objectives

To design a model to detect diseases of bean crop
To analyze the images and obtain accurate results in detecting bean crop disease
To evaluate the performance and the results of the selected models
To report the result of the study and recommend future research work
To highlight possible diseases to farmers and/or small-scale seed enterprises and sensitize them to these diseases

1.4. Scope and limitation of the research

1.4.1. Scope of the research

The thesis is mainly focused on the design and development of a detection model for bacterial blight, Alternaria leaf spot and halo blight disease on bean crop.

1.4.2. Limitation of the research

The research applies specifically to leaf disease detection for Ethiopian bean crops. It would be desirable to make the research available to every farmer over GSM and GPRS networks, but the work was carried out only in the laboratory. It also requires an expert who is trained in every disease type of bean and in image processing.

1.5. Significance of the study

Identifying bean crop disease with image processing, together with machine learning and computer vision, is a good technology for detecting bean crop diseases.
Because a farm contains a very large number of bean plants, detecting disease with the naked eye takes a lot of time and effort; at the same time, it is less accurate and can be applied only in a limited area. When automatic disease identification techniques and methods that can be deployed digitally are used instead, detection takes less time and effort, is more accurate, and covers a large area. This research paper will help agricultural experts appreciate the value and importance of computer vision in the field of agriculture. It will also help increase the harvest, because disease is detected early without the need to find agricultural experts, and it will decrease the cost of experts for the continuous care of crops on very large farms. The outcome of this research paper will also help different authorities to take proper measures in situations where there is bacterial blight, halo blight or Alternaria leaf spot disease. Finally, this thesis will serve as reference material for researchers who conduct research in computer vision, especially research related to crop disease identification.

1.6. Research organization

Besides this chapter, the research paper is organized as follows. The second chapter reviews related literature, which provides additional input and strengthens the ideas of this work. The third chapter covers the research methodology and describes the methods and techniques followed to achieve the work. The fourth chapter focuses on the proposed model structure and describes the system architecture. There the research paper shows and describes the flow of actions by which bean crop disease detection has been implemented. The fifth chapter shows the implementation of the research in accordance with the problems reviewed and identified, the objectives and the proposed methodology, and checks the algorithms selected to identify the disease.
Lastly, the research paper closes with a conclusion and recommendations based on the general outcome of the research and the researchers' view of future work.

CHAPTER TWO
LITERATURE REVIEW

2.1. Introduction

In digital image processing, a machine vision system starts with image acquisition. After the images are captured, the system follows a number of processes to reach its desired goal. Researchers have used machine learning algorithms (such as SVM, KNN and ANN) to detect and classify plant diseases. SVM is a discriminative classifier formally defined by a separating hyperplane. It finds the separator with the maximum margin to improve the performance of the classifier [12]. The K-means algorithm has been used for segmenting images of the diseases Alternaria leaf spot, bacterial blight and Cercospora yellow spot only. Diseases with unique spots, such as the webbed spot of spider mite, cannot be segmented using the K-means algorithm [13]. K-means is a kind of self-adaptive, partition-based search algorithm. It can segment an image by color and divide its different parts into different clusters. Using the K-means clustering algorithm, the different parts can then be handled conveniently [14].

Several research papers on plant disease detection are explained briefly here; they can be classified into two main categories. The first category focuses on detecting specific diseases on a certain plant or a group of plants. In addition, these papers describe some of the algorithms needed to implement each step of image processing. The second category, on the other hand, describes the main steps needed for detecting plant diseases. A description of several techniques that are currently used in detecting plant diseases is provided in the research paper "Detection and classification of plant leaf diseases using image processing techniques: A review" [15].
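
The color-based K-means segmentation described in this section can be illustrated with a small sketch. It is only a sketch under stated assumptions: the thesis implementation is in MATLAB, while this example uses plain Python; the function names, the toy pixel values and the deterministic choice of starting centroids are hypothetical and are not taken from the thesis or from the cited papers.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two colour tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_colors(pixels, k, iterations=20):
    """Group RGB tuples into k colour clusters with Lloyd's algorithm."""
    unique = sorted(set(pixels))
    if len(unique) < k:
        raise ValueError("need at least k distinct colours")
    n = len(unique)
    # Assumption made for reproducibility: start from the middle colour
    # of each of k equal slices of the sorted distinct colours.
    centroids = [unique[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in pixels
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(pixels, new_labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(ch) / len(members) for ch in zip(*members))
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids


# Toy "leaf image": green tissue, brown lesion and bright background pixels.
pixels = [
    (40, 160, 50), (45, 150, 55), (38, 170, 48),        # healthy green
    (120, 70, 30), (130, 75, 35), (125, 65, 28),        # brown lesion
    (240, 240, 240), (235, 238, 241), (250, 250, 250),  # background
]
labels, centroids = kmeans_colors(pixels, k=3)
```

With three clusters, the green, brown and background pixels of this toy example end up in separate groups, which mirrors the idea of isolating the diseased region before feature extraction. On a real image the pixels would usually be converted to a perceptual color space first, and a library implementation would be preferable to this hand-written loop.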
The system described in [15] determines whether the plant is healthy or not, the disease name, and the percentage of the leaf area that is infected. The authors did not focus on a certain plant or disease. In fact, their main aim was to increase the accuracy of plant disease detection. To do so, they used a non-linear classifier, SVM. The author M. T. [16] demonstrates the image processing steps that should be used for detecting plant diseases. These steps are image acquisition, image preprocessing, image segmentation, feature extraction, and classification of diseases. The paper also explains different algorithms for implementing image segmentation, feature extraction and classification. These algorithms include k-means clustering, color co-occurrence, and neural networks.

The research paper titled “Agricultural plant leaf disease detection using image processing” [17] explains that there is a problem in choosing the best classification technique for detecting plant diseases. This is because each classifier gives different results for different types of input data. Therefore, several classification techniques are explained in detail along with their advantages and disadvantages. From the researchers' point of view, although k-Nearest Neighbor is the simplest classifier of all, it takes a long time to make predictions and it can be affected by irrelevant parameters.

A survey of different approaches to detecting plant diseases is given in the paper “A novel approach for the detection of plant diseases”, IJCSMC, 2016 [18]. The researchers' purpose was to explain the algorithms used for each step of image processing. The segmentation algorithms mentioned were K-means clustering, the Otsu method, and conversion of the RGB image to the HSI model, while co-occurrence and mRMR methods were used for feature extraction.
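
The co-occurrence idea used for texture feature extraction can be sketched as follows. This is a hedged illustration rather than the method of any cited paper: it is written in plain Python, the function names are hypothetical, the pixel offset (one pixel to the right) is an assumption, and the three features shown (contrast, energy, homogeneity) are the commonly used definitions for a normalized gray-level co-occurrence matrix.

```python
def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Normalised gray-level co-occurrence matrix for the offset (dx, dy).

    image is a list of rows holding integer gray levels in [0, levels).
    Entry [i][j] is the share of pixel pairs whose first pixel has level i
    and whose neighbour at the given offset has level j.
    """
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    if total == 0:
        raise ValueError("image is too small for this offset")
    return [[n / total for n in row] for row in counts]


def texture_features(p):
    """Contrast, energy and homogeneity of a normalised co-occurrence matrix."""
    idx = range(len(p))
    return {
        "contrast": sum(p[i][j] * (i - j) ** 2 for i in idx for j in idx),
        "energy": sum(p[i][j] ** 2 for i in idx for j in idx),
        "homogeneity": sum(p[i][j] / (1 + abs(i - j)) for i in idx for j in idx),
    }


# A flat patch and a checkerboard patch give opposite texture signatures.
flat = texture_features(cooccurrence_matrix([[0, 0], [0, 0]], levels=2))
checker = texture_features(cooccurrence_matrix([[0, 1], [1, 0]], levels=2))
```

A uniform patch has zero contrast and maximal energy, while a checkerboard patch has high contrast and lower energy; differences of this kind are what a classifier can use to separate lesion texture from healthy tissue.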
The authors mentioned Knearest neighbor, ANN, fuzzy logic, and SVM as techniques used for classifying the detected disease. A general procedure in how to detect leaf diseases has been provided in the paper called "Plant disease detection using image processing”[19]. The image should be preprocessed to extract the useful information from it. Therefore, color transformation is needed to convert the RGB image, which is a color generator, into HSI, a color descriptor. Consequently, the green pixels should be masked and removed and then the image should be segmented. After segmenting the image, a Spatial Gray-level Dependence Matrices (SGDM) method has been applied to extract the texture features of the leaf before passing it into a classifier. The research paper written by M. T. [16] did not specify an approach that can be applied to detect a specific type of plant or disease but we have specified. The authors researched as "Detection and classification of plant leaf diseases using image processing techniques: A review”[15] did not focus on a certain plant or disease rather than discussing some disease and plants. The researchers did not focus on the dangerous bean crop diseases, bacteria blight, Alternaria leaf spot and Halo blight that attack bean and minimize the productivity of the country, Ethiopia, but we focused on it. Therefore, here we researched a solution for Bacteria blight, Halo blight and Alternaria leaf spot crop disease which is not considered for bean crops before by the other local or international researchers. We have 7 been focused on detecting the diseases type and calculate the percentage of the bean crop leaf. Basically, the research has classified the bean crop in to two as diseased (bacteria blight, halo blight and Alternaria leaf spot) and healthy with increasing the accuracy of the classifier. 2.2. 
Bean Diseases

Bacteria infect the seed either by passing through the vascular system of the pedicel of the pod, the vascular tissues of the pod, and then the funiculus, or by growing from an external infection of the pod through the parenchyma and into the conducting tissues. In either case, infections of the seed are usually not deep, but lie primarily in the surface to subsurface regions of the cotyledons.

Bacterial Blight

Bacterial blight is one of the most severe diseases of pomegranate and bean and is caused by bacteria. It shows up to 100% severity in some orchards. The symptoms are initially found on the stem and gradually spread to the leaves and later to the fruits. On fruits, brown-black spots appear on the pericarp, with cracks passing through those spots. The bacteria survive on the tree and on diseased fallen leaves, and spread to healthy plants in the area through wind-splashed rain and infected cuttings. High temperature and relative humidity favor the disease [20].

It is also a typical leaf spot and leaf blight disease. At first, the lesions on leaves are small, translucent, water-soaked spots, which later develop dry brown centers and narrow yellow halos. Lesions coalesce into irregularly shaped areas that may include the whole leaflet. Lesions on stems and pods tend to be more restricted and sunken. Those on the pods frequently turn from red to brown with age. With continued development, the vascular system may also turn brown and surface cankers may form on the stem. This bean disease cannot be diagnosed with certainty in the field. It must be distinguished from four other bacterial diseases of bean that have overlapping symptoms. This problem of differentiating diseases with overlapping symptoms is commonly encountered in the positive diagnosis of plant diseases. Common bacterial blight gives leaf tissue a scalded appearance with water-soaked spots.
These small, irregularly shaped lesions often enlarge to 1 inch or more and form dark brown lesions along the edge of the leaflet. A narrow lemon-yellow margin often surrounds these lesions, and large portions of the foliage can be infected. Infected pods exhibit circular, water-soaked areas that often produce yellow masses of bacterial ooze. Later, the spots dry and appear as reddish-brown lesions. Pod infection often causes discoloration, shriveling and bacterial contamination of seeds; however, some seeds may appear healthy [21]. Bacterial blight is characterized by small, pale green spots or streaks that appear water-soaked. The lesions expand and then appear as dry, dead spots [22], and may extend the full length of the leaf.

Bacterial blights, caused by various species of bacteria, occur in most of the bean-growing areas of the world [23]. Under favorable weather conditions, these bacteria can spread rapidly through a field, causing defoliation and pod damage. Bacterial blight (BB) is a significant seed-borne disease of faba bean, caused by the Gram-negative bacterial pathogen Xanthomonas axonopodis pv. phaseoli (Xap) and its fuscans variant Xanthomonas fuscans subsp. fuscans (Xff). Both strains cause identical symptoms, but Xanthomonas phaseoli var. fuscans has been reported to be more aggressive [24]. In Ethiopia, it is ranked among the most important and widespread diseases of faba bean and is reported as one of the main constraints to faba bean production throughout the country. In pomegranate, bacterial blight is caused by Xanthomonas axonopodis pv. punicae [25]. It appears on leaves, stems and fruits and reduces yield by nearly 65% to 75%. The disease is not easily cured by antibiotics or other chemicals, so it needs periodic, day-to-day observation. Only this helps in finding the infection at an early stage, so that the farmer can take precautionary measures to help the plants overcome the infection.

Halo blight

The researcher K.W.
described different bean diseases that frequently attack the crop: halo blight, fuscous blight and bacterial brown spot [23]. The first, halo blight, is caused by Pseudomonas phaseolicola and characteristically exhibits a halo. The disease, however, cannot be distinguished from common blight on the basis of its large halo, because the halo is not produced if there have been periods of very warm weather. Halo blight symptoms first appear as small, angular, water-soaked spots (almost resembling little pin pricks) on the undersurfaces of leaves. As these spots grow and turn brown, a characteristic light green to yellow halo appears around them. This halo is due to the action of a toxin produced by the bacteria and is a diagnostic symptom of the disease.

Fuscous blight

This is the second disease described in [23]. It can be distinguished only by the brown pigment that the causal organism, Xanthomonas phaseoli var. fuscans, produces on certain media such as PDA.

Bacterial Brown Spot

This disease, caused by Pseudomonas syringae pv. syringae according to the researcher K.M [23], is more common on lima beans than on other bean types. Small, water-soaked spots on leaves become red-brown in color. Spot centers dry out, turn grey, and may fall away. Veins on the underside of the leaves may turn red-brown. Spots on stems and pods are more elongated than those on leaves.

2.3. Digital Image Processing

Crop disease detection using image processing is a useful method for reducing crop diseases. Multiple image processing methods are used to detect the diseases. The authors of “Crops Disease Diagnosing Using Image-Based Deep Learning Mechanism” [26] (2018) used an approach based on convolutional neural networks to classify diseases of strawberry plants. The system makes use of deep learning to diagnose the disease. The researchers O. Min and N.
Chi Htun [27] in the same year used image processing techniques to detect and classify four types of plant disease: rust, Cercospora leaf spot, bacterial blight and powdery mildew. The study “Disease detection in crops using remote sensing image” (2017) [28] used remote sensing images for early detection of crop diseases, applying Canny edge detection and histogram matching. In 2016, the authors of [29] classified six different diseases of tomato plants using image processing techniques. These techniques extract features from images of healthy and diseased plants.

Digital image processing is the use of computer algorithms to perform image processing on digital images [30]. It allows a much wider range of algorithms to be applied to the input data and can avoid problems such as the build-up of noise and signal distortion during processing. Digital image processing has a very important role in agriculture, where it is widely adopted to observe crop disease with high accuracy. Detection and recognition of plant diseases using digital image processing is very effective in revealing the characteristic symptoms of diseases at their early stages. Plant pathologists analyze digital images using digital image processing to diagnose crop diseases.

According to the paper “A Brief Review on Plant Disease Detection using in Image Processing” [30], computer systems are developed for agricultural applications such as detection of leaf diseases and fruit diseases. In all these techniques, digital images are collected using a camera, and image processing techniques are applied to these images to extract valuable information that is essential for analysis. These diseases occur mostly on the leaves and stems of plants. They may be viral, bacterial or fungal, or caused by insects, rust, nematodes and so on. It is an important task for farmers to find these diseases as early as possible.
Image processing is a form of signal processing in which the input is an image and the output may be either an image or a set of characteristics or parameters related to the image [31]. Most image processing techniques treat the image as a two-dimensional signal. Image processing is computer imaging in which the application involves a human being in the visual loop; in other words, the images are examined and acted upon by people.

The research papers “SVM Classifier Based Grape Leaf Disease Detection”, “Detection of Leaf Diseases and Classification using Digital Image Processing”, “A Survey on Detection and Classification of Rice Plant Diseases”, “Image based Plant Disease Detection in Pomegranate Plant for Bacterial Blight”, “Image based Plant Disease Detection in Pomegranate Plant” and “Leaf disease detection and fertilizer suggestion” developed systems that detect plant diseases in two phases: a training phase, which includes image acquisition, image preprocessing, feature extraction, segmentation, classification and calculation of the percentage of infection, and a test phase. They identified speed and accuracy as the main characteristics that crop disease detection using machine learning algorithms must achieve [32][33][34][35][25][36].

2.3.1. Image processing methods

There are two methods used to process images, as stated in the paper “Overview of Image Processing” [31].

A. Analog Image Processing

Analog image processing is image processing conducted on two-dimensional analog signals. It alters the image through electrical means, as in a television image, and is used for hard copies. In analog photography, the image is burned into film by a chemical reaction activated by controlled exposure to light. Analog images are processed in a darkroom, using special chemicals to create the actual image.

B.
Digital Image Processing

Digital image processing is the use of computer algorithms to perform image processing on digital images. Its benefits include consistently high image quality, a low cost of processing, and the ability to manipulate all aspects of the process. The image is stored as a computer file, and the stored file is translated using photographic software to generate an actual image. The advantages of digital image processing methods are their versatility, repeatability and the preservation of original data precision.

2.3.2. Fundamental Steps of Digital Image Processing

S. D et al. [37] discussed the main steps of image processing for detecting and classifying plant disease: image acquisition, image preprocessing, image segmentation, feature extraction and classification. For segmentation they used Otsu's method, conversion of the RGB image into the HSI model, and k-means clustering; according to them, k-means clustering gives accurate results. Feature extraction is then carried out on features such as color, texture, morphology and edges, among which morphology gives the best result. After feature extraction, classification is done using methods such as Artificial Neural Network and Back Propagation Neural Network.

The researchers M. C. Ghulam and G. Vikrant [38] introduced an advanced system for the detection of plant disease. They aimed at the design and development of image processing-based software for automatic classification and detection of disease in plants. In that paper, detection is performed on two distinct classes of disease, scorch and spot. Algorithms are designed for segmentation, feature extraction, classification and detection of disease. One drawback of the technique is that it can be implemented only under controlled laboratory conditions.
It has good adaptability to different color spaces, but it yields poor segmentation results on the tested images.

The paper “Image Processing System for Plant Disease Identification by Using FCM-Clustering Technique” implemented a method for plant disease identification in which segmentation is done using the FCM (fuzzy C-means) clustering technique [39]. Features are extracted from the affected regions and passed to an SVM (support vector machine) classifier for classification. The combination of techniques used in that paper can classify diseases efficiently but takes more processing time. Hence, its main drawback is that early detection of disease is not possible.

The paper “Petiole detection algorithm based on leaf image” [40] implemented a method for detecting unhealthy regions of plant leaves using image processing and a genetic algorithm. A genetic algorithm is an iterative evolutionary algorithm for generating solutions to analytical problems. The algorithm begins with a set of solutions called a population. Solutions from one population are chosen and used to form a new population. This method can extract features of the disease from the segmented part efficiently, but it takes more time to handle multiple iterations over the sample input, and its execution speed is low because training takes longer.

The paper “Plant Disease Detection Techniques: A Review” [41] states that a plant disease detection system basically involves four phases, as shown in Fig 2-1. The first phase involves acquisition of images, either through a digital camera or mobile phone or from the web. The second phase segments the image into a number of clusters, for which different techniques can be applied. The next phase contains feature extraction methods, and the last phase is the classification of diseases.
Figure 2-1: Phases of plant disease detection system

The paper “Detection and classification of plant leaf diseases using image processing techniques: A review” [42] proposed image processing steps that start from image preprocessing, as shown in Fig 2-2. It omits the image acquisition step, although researchers must first obtain the images through image acquisition.

Figure 2-2: Block diagram for Image Processing

We propose that it is mandatory to include the image acquisition step when classifying images according to the diseases that attacked them, with the steps shown in Fig 2-3.

Figure 2-3: Framework of image processing operation

1. Image acquisition

The images of the bean crop leaves are captured with a camera [25]. The captured images are in RGB form. Scaling and color transformation of an image, if required, are done in image pre-processing. Image acquisition in image processing can be broadly defined as the action of retrieving an image from some source, usually a hardware-based source, so that it can be passed through whatever processes need to occur afterward [43]. Image acquisition is always the first step in the workflow sequence because, without an image, no processing is possible. The acquired image is completely unprocessed and is the result of whatever hardware was used to generate it, which can be very important in some fields for having a consistent baseline from which to work. One of the ultimate goals of this process is to have a source of input that operates within such controlled and measured guidelines that the same image can, if necessary, be nearly perfectly reproduced under the same conditions, so that anomalous factors are easier to locate and eliminate.

2. Image Pre-processing

Pre-processing is a technique used to analyze real-time problems in images [44]. Inspection is the only way to investigate the disease present within a fruit.
To accomplish this, an image of the leaf can be captured and then analyzed using pre-processing techniques. The preprocessing technique used for this purpose is conversion from RGB to a different color space. Different preprocessing techniques such as image cropping, resizing, color transformation, contrast enhancement and filtering are applied to remove noise and enhance the images in the dataset [45]. For a better outcome in the segmentation step, we concentrate on enhancing the image of the bean crop leaf to improve the image colors or intensities, which helps emphasize the texture and disease color. This is useful for a segmentation step that uses color intensity as an attribute. In this study, preprocessing resizes the image to reduce its size and the memory used by the processing system. Histogram equalization is used to adjust image intensities and enhance contrast. In this way we obtained images with clearer edges of the leaf and of the diseased areas.

3. Image segmentation

In this research, the method used for segmentation is the k-means clustering algorithm. K-means clustering is an unsupervised machine learning algorithm used to classify or categorize datasets into groups. It is an iterative, data-partitioning algorithm that assigns n observations to exactly one of k clusters defined by centroids [46]. Image segmentation also aims at simplifying the representation of an image so that it becomes more meaningful and easier to analyze [47]. As the premise of feature extraction, this phase is also a fundamental part of image processing. Images can be segmented by various methods, such as k-means clustering, Otsu's algorithm and thresholding. K-means clustering classifies objects or pixels, based on a set of features, into K classes.
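The k-means idea described above can be sketched in a few lines. The thesis implementation was written in MATLAB; the block below is only an illustrative Python sketch, and the function name `kmeans`, the sample intensity values and the fixed starting centroids are assumptions made for this example (a random initial assignment is equally common). Each pixel is treated as a one-feature point, assigned to its nearest centroid by Euclidean distance, and the centroids are recomputed until no pixel changes cluster.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def kmeans(points, centroids, max_iter=100):
    """Cluster `points` (tuples of floats) around the given starting centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:  # no point changed cluster: clusters are stable
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids


# Hypothetical 1-D pixel intensities grouped into K = 3 clusters.
pixels = [(10.0,), (12.0,), (11.0,), (120.0,), (125.0,), (240.0,), (250.0,)]
labels, centers = kmeans(pixels, [(0.0,), (128.0,), (255.0,)])
print(labels)   # [0, 0, 0, 1, 1, 2, 2]
print(centers)  # [(11.0,), (122.5,), (245.0,)]
```

On a real leaf image the points would be per-pixel color features rather than hand-written values, and the cluster containing the diseased region would still have to be selected afterwards.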
In k-means clustering, the classification is done by minimizing the sum of squared distances between the objects and their corresponding clusters [48]. Image segmentation plays a vital role in plant disease detection [34]. Image segmentation means dividing the image into particular regions or homogeneous objects. According to the paper “A Survey on Detection and Classification of Rice Plant Diseases”, the primary aim of segmentation is to analyze the image data so that useful features can be extracted from it. There are two ways to carry out image segmentation: (1) based on discontinuities and (2) based on similarities. In the first, an image is partitioned based on sudden changes in intensity values, e.g., via edge detection. In the second, images are partitioned based on specific predefined criteria, e.g., thresholding using Otsu's method.

The authors of [49] considered a number of crop types, namely fruit crops, vegetable crops, cereal crops and commercial crops, to detect fungal diseases on plant leaves, adopting different methods for each type of crop. The paper “Detection of Plant Disease Using Threshold, K- Mean Cluster and ANN Algorithm”, IEEE, 2017 [42] states the following for each type of crop. For fruit crops, k-means clustering is the segmentation method, texture features are the focus, and classification uses ANN and nearest neighbor algorithms, achieving an overall average accuracy of 90.723%. For vegetable crops, the Chan-Vese method is used for segmentation, local binary patterns for texture feature extraction, and SVM and k-nearest neighbor for classification, achieving an overall average accuracy of 87.825%. The commercial crops were segmented using the grab-cut algorithm; wavelet-based feature extraction was adopted, with Mahalanobis distance and PNN as classifiers, giving an overall average accuracy of 84.825%.
The cereal crops were segmented using k-means clustering and the Canny edge detector. Color, shape, texture, color texture and random transform features were extracted, and SVM and nearest neighbor classifiers gave an overall average accuracy of 83.72%. Building on this, we sought to improve on the accuracies reported in the paper “Detection of Plant Disease Using Threshold, K- Mean Cluster and ANN Algorithm” [42] by using texture feature extraction.

For this research we selected k-means clustering to segment the bean crop leaf images. K-means clustering is used to segment an image into three groups [50], and the clusters contain the diseased part of the leaf. Before clustering, the ‘a’ component is extracted from the L*a*b color space. The properties and process of the k-means algorithm are given below.

1) Properties of K-Means Algorithm
a) There are always K clusters.
b) There is at least one item in each cluster.
c) The clusters never overlap.
d) Each member of a cluster is nearer to its own cluster than to any other cluster.

2) The Process of K-Means Algorithm
a) First divide the dataset into K clusters and assign the data points randomly to the clusters.
b) Then, for each data point, calculate the Euclidean distance from the data point to every cluster. The Euclidean distance is the straight-line distance between two pixels and is given as follows:

Euclidean Distance = √((x1 - x2)² + (y1 - y2)²) ------------------- (1)

where (x1, y1) and (x2, y2) are two pixel points (or two data points).
c) If the data point is closest to its own cluster, leave it where it is.
d) If the data point is not closest to its own cluster, move it into the nearest cluster.
e) Repeat the above steps until a complete pass through all the data points results in no data point moving from one cluster to another.
f) At this point the clusters are stable and the clustering process ends.

4.
Feature extraction

Feature extraction is the process of determining common features, from which groups or clusters are formed and particular values extracted to reduce complexity. In image processing, the SVD approach is commonly used for feature extraction [44]. According to [51], a web-based tool was developed to identify fruit diseases by uploading a fruit image to the system. Feature extraction was done using parameters such as color, morphology and CCV (color coherence vector), clustering was done using the k-means algorithm, and SVM was used to classify images as infected or non-infected. That work achieved an accuracy of 82% in identifying pomegranate disease.

The feature extraction aspect of image analysis focuses on detecting essential characteristics or features of objects present within an image [34]. These features can be used to describe the object. Generally, features in the following three categories are extracted: color, shape, and texture, which are the main types of features used in image processing. The researchers assumed that color is an important feature because it can differentiate one disease from another. Furthermore, each disease may have a different shape, so the system can differentiate diseases using shape features such as area, axis, and angle. Texture describes how color patterns are scattered in the image. Since each type of disease presents different color and shape properties, the color and shape features are further used in classification. Feature extraction is used to extract the information that can be used to determine the significance of a given sample. We extracted the texture features (statistical-based feature extraction) of the bean crop leaf to obtain a full set of features prepared for classification with less ambiguity.

5.
Classification

Classification is the process of finding the features of images and grouping them into specific classes. We used the Support Vector Machine (SVM), a well-known supervised machine learning algorithm [52][53]. It is capable of classifying effectively in high-dimensional spaces, working on small datasets, and dealing with non-separable data by means of a technique called the kernel function. Using a kernel function, we mapped the feature space to a high-dimensional feature space in which the data vectors become linearly separable, so that a hyperplane separating the dataset can be found in that space. Kernel functions such as the polynomial, radial basis function (RBF) and sigmoid functions are commonly used with SVM. SVM is also used to classify datasets into specified categories [54]. It is a discriminative classifier formally defined by a separating hyperplane, and it finds separators with maximum margin to improve the performance of the classifier. The kernel function used in SVM is defined as the mathematical formula used to transform an inseparable dataset from the input space to a lower- or higher-dimensional space in which it becomes separable [46]. For classification, [55] used the SVM classifier to identify the classes that are closely connected to the known, trained classes. The support vector machine creates the optimal separating hyperplane between the classes using the training data.

Many classifiers have been used by researchers in the past few years, such as k-nearest neighbor (KNN), support vector machine (SVM), artificial neural network (ANN), back propagation neural network (BPNN), Naïve Bayes and decision tree classifiers. According to [42], the most commonly used classifier is SVM. Though every classifier has its advantages and disadvantages, SVM is a simple-to-use and robust technique.
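To make the kernel functions named above more concrete, the following Python sketch (the thesis itself used MATLAB) evaluates the polynomial, RBF and sigmoid kernels on small feature vectors. The function names, the parameter names `degree`, `gamma` and `coef0`, and their default values are illustrative assumptions, not settings reported in this study.

```python
from math import exp, tanh


def dot(x, y):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """K(x, y) = (x . y + coef0) ** degree"""
    return (dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    """K(x, y) = exp(-gamma * ||x - y||^2)"""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return exp(-gamma * squared_distance)


def sigmoid_kernel(x, y, gamma=0.1, coef0=0.0):
    """K(x, y) = tanh(gamma * x . y + coef0)"""
    return tanh(gamma * dot(x, y) + coef0)


# Two hypothetical texture-feature vectors.
u = (1.0, 2.0)
v = (3.0, 4.0)
print(polynomial_kernel(u, v))  # 144.0
print(rbf_kernel(u, u))         # 1.0 (a vector is maximally similar to itself)
print(rbf_kernel(u, v))
print(sigmoid_kernel(u, v))
```

Each kernel returns a similarity score that an SVM can use in place of the plain inner product, which is how the classifier works in the higher-dimensional space without computing the mapping explicitly.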
Image processing techniques can be applied in various applications, as follows [30]:

1. To detect plant leaf, stem, crop and fruit diseases.
2. To quantify the area affected by disease.
3. To find the boundaries of the affected area.
4. To determine the color of the affected area.
5. To determine the size, texture and shape of fruits.

2.3.3. Types of image processing

The paper “Automatic Flower Disease Identification Using Image processing” [43] divides image processing into low-level, mid-level and high-level processing, described below.

Low-level

Low-level processes involve primitive operations such as image pre-processing to reduce noise, contrast enhancement, and image sharpening, as shown in Figure 2-4. Low-level processing is characterized by the fact that both its inputs and outputs are images.

Figure 2-4: Low level Image processing

Mid-level

Mid-level processing of images involves tasks such as segmentation (partitioning an image into regions or objects), description of those objects to reduce them to a form suitable for computer processing, and classification (recognition) of individual objects. The input of this step may be images from first-level processing or images that are directly captured. This level is characterized by the fact that its inputs are generally images, but its outputs are attributes extracted from those images (e.g., edges, contours, and the identity of individual objects).

Figure 2-5: Middle level image processing

High-level

Higher-level processing involves “making sense” of an ensemble of recognized objects, as in image analysis, and, at the far end of the continuum, performing the cognitive functions normally associated with vision. It also encompasses processes that extract attributes from images, up to and including the recognition of individual objects.
The processes of acquiring an image of an area containing text, pre-processing that image, extracting (segmenting) the individual characters, describing the characters in a form suitable for computer processing, and recognizing those individual characters are all within the scope of what we call digital image processing.

Today, there is almost no area of technical endeavor that is not affected in some way by digital image processing. Its application areas are diverse. One of the simplest ways to develop an understanding of the extent of image processing applications is to categorize images according to their source. The principal energy source for images in use today is the electromagnetic energy spectrum. Other important sources of energy include acoustic, ultrasonic, and electronic (in the form of electron beams used in electron microscopy). Imaging techniques based on the electromagnetic spectrum include gamma-ray imaging (nuclear medicine and astronomical observation), X-ray imaging (medical diagnosis and astronomy), imaging in the ultraviolet band (lithography, industrial inspection, microscopy, lasers, biological imaging and astronomical observation), imaging in the visible and infrared bands (light microscopy, astronomy, remote sensing, industry, and law enforcement), imaging in the microwave band (radar systems) and imaging in the radio band (medicine and astronomy).

2.4. Related sources

To accomplish the objectives of the research, literature on contemporary developments in machine learning algorithms related to cereal, plant and fruit classification was reviewed. The literature reviewed is concerned with image processing using different machine learning algorithms such as SVM, KNN, ANN and many others. All of it is based on image processing, deep learning and machine learning for detecting and classifying crops, plants, flowers and vegetables.
We used different research papers as sources and inputs for our work and cited them accordingly.

CHAPTER THREE
RESEARCH METHODOLOGY

This chapter describes the experimental method and approach used and the resources required. To make this research successful with respect to its objective, literature on contemporary developments in image processing related to plant, cereal and fruit classification was reviewed. From this review, image processing techniques using machine learning, and tools employed for variety identification of agricultural products, that were pertinent to this work were selected. The methodology for detecting bean crop leaf diseases involves several tasks: image acquisition, image preprocessing, image segmentation, feature extraction and leaf disease classification. First, in image acquisition, images of the various bean crop leaves to be classified are taken using a digital camera. Second, image preprocessing is applied to remove noise. In the third phase, segmentation is performed to discover the actual segments of the leaf in the image. In the fourth phase, feature extraction for the infected part of the leaf is carried out based on specific properties among pixels in the image or their texture, alongside statistical analysis to choose the best features that represent the given image, thus minimizing feature redundancy. Finally, classification is carried out using a support vector machine.

3.1. Experimentation Tools

Several things, such as software and hardware specifications, should be considered to make sure that the development stage of the system runs successfully. We selected MATLAB version R2015b, which is developed by MathWorks, to implement the prototype of the system, together with other libraries that are compatible with the simulator.
Visio 2016 was also used for designing the system architecture, algorithms and SVM classifier. In addition to the above software, we used Adobe Photoshop CS4 for image formatting. A Tecno W5 mobile phone with a 13-megapixel digital camera was used to collect bean images.

3.2. Algorithm

The research used the k-means clustering algorithm for segmentation and the support vector machine for classification, which made it easy to detect bean crop diseases in MATLAB. The code designed in MATLAB consists of two major functions: one that generates the training model and one that tests data.

3.3. Analysis and Design

Analysis is the process of determining the needs or conditions to be met by a new or altered system. Design is the process of problem solving and planning for a software solution. It includes low-level component and algorithm implementation issues as well as the architectural view. There is a growing demand for image processing in diverse application areas, such as multimedia computing, secured image data communication, biomedical imaging, biometrics, remote sensing, texture understanding, pattern recognition, content-based image retrieval, compression and so on. For design purposes, we analyzed the data collected from different sources with respect to the diseases that we wanted to identify, in accordance with the aim of the research.

3.4. Data Collection and Dataset Preparation

3.4.1. Data Collection

We collected bean crop leaf images from the Ethiopian Institute of Agricultural Research (EIAR), Debre Zeit center, through observation and from existing resources. In addition to the images from the institute, further samples (healthy and infected leaf images) were collected from the web. This allows the model to be trained on images with different imaging properties and conditions.

3.4.2. Dataset Preparation

Data preparation is required to train and test the model.
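As a rough illustration of how a labeled image collection might be divided into disjoint training and testing subsets, the Python sketch below shuffles a list of file names and cuts it at a chosen fraction. It is not the MATLAB code used in this study; the function name `split_dataset`, the file names, the fixed random seed and the 80/20 ratio are assumptions made for the example.

```python
import random


def split_dataset(items, train_fraction=0.8, seed=0):
    """Shuffle `items` reproducibly and split them into (train, test) lists."""
    shuffled = list(items)                 # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed gives a repeatable split
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names standing in for 100 collected leaf images.
image_names = [f"leaf_{i:03d}.jpg" for i in range(100)]
train_set, test_set = split_dataset(image_names)
print(len(train_set), len(test_set))  # 80 20
```

Because the two lists are slices of one shuffled copy, no image can appear in both the training and the testing subset.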
From the collected images, a training set of manually classified and labeled images and a testing set of randomly selected, unclassified and unlabeled images were prepared. The images in the testing set are different from those included in the training set. Of the 100 images collected in total, 80 (80%) are used for training and 20 (20%) for testing.

3.5. Sampling techniques

Sampling is one of the core procedures in classifying and detecting disease. For sampling, we selected the faba bean and took images of faba bean samples. We chose faba bean because of its availability and importance in Ethiopia and because it is heavily attacked by disease. The larger part of the samples was used for training and the remaining, smaller part for testing. The training sample is composed of images of healthy faba beans and of faba beans with bacterial blight, halo blight and Alternaria leaf spot.

Figure 3-1: Methodology of sampling (flow: literature review, sample collection, image acquisition, image preprocessing, image segmentation, feature extraction, image classification)

3.6. Materials and methods

When the images were taken, the camera was mounted on a stand that provides easy vertical movement and stable support. Samples were arranged on a black background table during image recording. The diseased beans were scattered on the table so that none touched another; this separation was kept to make image segmentation easier. To obtain uniform lighting, a 100 W incandescent lamp with a rated voltage of 220 V was used in all experiments. The lighting system was switched on about 5 minutes before any images were acquired so that it could stabilize. To reduce the influence of surrounding light, we took the samples in a controlled room.
The images were taken at a resolution of 2818 x 1826 pixels and resized to 256 x 256.

3.7. Evaluation Technique

The research model was assessed by running the trained classifier on a test dataset. The classifier's performance is returned as an output that contains the percentage accuracy for each class. In this research the classifier accuracy and the total infected part of each image were calculated. The system also derives the error rate of the classifier, which indicates the correct and incorrect allotment of samples to their respective classes.

Figure 3-2: Evaluation metric samples

\[ \text{Accuracy (\%)} = \frac{TP + TN}{TP + FP + TN + FN} \times 100 \qquad (2) \]

\[ \text{Error rate (\%)} = 100 - \text{Accuracy (\%)} \qquad (3) \]

where TP is true positive, TN is true negative, FP is false positive and FN is false negative. In this research these quantities are interpreted as:

TP: number of infected crop leaves classified as 'INFECTED'
TN: number of healthy crop leaves classified as 'HEALTHY'
FP: number of healthy crop leaves classified as 'INFECTED'
FN: number of infected crop leaves classified as 'HEALTHY'
Total: total number of samples (crop leaf images)

In this research, the efficiency of the proposed methodology for detecting faba bean crop diseases is tested and evaluated. To evaluate the classification accuracy and detect those diseases, five steps are implemented:

1. Image acquisition is the first step in processing an image. In this step, the available images from the digital camera or the internet are gathered for preprocessing.

2. Preprocessing scales the collected image and applies a min-max linear contrast stretch to improve the quality of the original image. The stretch linearly expands the original data values into a new distribution. A transformation structure is then built and applied to create the enhanced image.
3. Segmentation is carried out using K-means clustering with Euclidean distance to extract the region of interest from the image.

4. Feature extraction is performed after the image has been divided into its homogeneous parts. The Grey Level Co-occurrence Matrix (GLCM) is used for this purpose. Thirteen parameters are extracted from the testing and training images.

5. Classification is the last stage. To classify bean crop leaves we use a linear classifier, the machine learning algorithm called Support Vector Machine (SVM). It was chosen over other classifiers because it has high prediction accuracy and still works when there are errors in the training samples. The classifier can be used for many kinds of classification, including texture classification. The input to an SVM is usually not linearly separable; it is therefore mapped into a high-dimensional space in which the data become linearly separable, which results in good classification. The classifier works with two classes divided by a hyperplane placed so that the distance between the support vectors and the hyperplane is as large as possible. This does not mean, however, that multiclass classification cannot be implemented with this classifier [10].

CHAPTER FOUR
PROPOSED SYSTEM MODEL

The aim of this chapter is to discuss the approach and framework for the project. It covers the methods, techniques and approaches used in designing and implementing the thesis.

4.1. System Architecture

The figure below depicts the architecture of the system and how it works. Each main task has additional sub-tasks, which are explained in detail in the following sections; this section gives an overview of the system, following the research paper [43].

Figure 4-1: Proposed system architecture

The architecture shown above describes the overall process followed to classify an input image into either of two classes.
According to the architecture, the training and testing phases are performed independently. The training phase begins by importing a number of images, which are processed one after another and independently of each other. The testing phase starts only after the training phase has finished processing its images and has trained the machine for accurate classification. In the second phase, the testing phase, a single image is imported for processing. After import, images in both phases pass through the same processes, which serve the same purpose: preprocessing, segmentation and feature extraction are identical in both phases. After feature extraction the two phases follow different paths. The training phase passes each feature vector together with its label to the model for training, and the result is stored in the knowledge base. The testing phase passes a feature vector to the model and expects a label in return; the classifier returns that label from the previously trained knowledge base.

The proposed methodology for bean crop image classification has five vital stages. The initial stage is image acquisition, in which the real-world sample is recorded in digital form. In the next stage the images are passed to preprocessing, which reduces the size and complexity of the image. The resulting digital information is then subjected to segmentation and feature extraction, which separate the rotten portion of the leaf samples. Finally, the area of the segmented part is calculated and the image is classified into its category using the machine learning algorithm, SVM.

Figure 4-2: Flow chart to classify images

4.2. Tasks of Image processing

To detect faba bean crop disease in an image, our research carried out the following tasks in order.

4.2.1. Image Acquisition

According to the research paper "A Novel Approach to Classify and Detect Bean Diseases based on Image Processing" [11], the initial process is to collect data from the source selected by the researcher. We captured images with our camera and used them as input for further processing. We used the widely supported .jpg format for the input images. This step is carried out with a camera. The output of this process is a number of captured images of faba bean in the format produced by the device.

Figure 4-3: Healthy bean and with bacterial blight

4.2.2. Image Preprocessing

Images acquired from the field and the web may contain noise. Preprocessing is therefore performed to eliminate the noise in the image, to adjust the pixel values and to change the image background to black. It enhances the quality of the image. Image filtering and segmentation techniques can be used to remove noise. The output of this phase is segmented images containing the leaves from the images of the first phase (image acquisition).

Different preprocessing techniques were considered for removing noise. In this research, image cropping and image enhancement are used: cropping the leaf image isolates the region of interest, and enhancement increases the image contrast. The Red, Green and Blue (RGB) images are also converted into grey images using the following color conversion formula:

\[ F(x) = 0.2989\,R + 0.5870\,G + 0.1140\,B \qquad (4) \]

The input images, originally thousands by thousands of pixels in size, are resized to 256x256 pixels for the next process and are cropped to leave only the diseased area of the leaf. This reduces the computation time and memory required, since only a small portion of the bean crop leaf is processed.
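As a rough illustration of the grey conversion in Equation (4) and of the min-max linear contrast stretch named in Chapter Three, the following is a minimal sketch in plain Python. It is not the MATLAB code used in this research; the function names `rgb_to_gray` and `min_max_stretch`, the nested-list image layout and the rounding to integer grey levels are assumptions made only for this example.

```python
# Luminance weights taken from Equation (4).
WEIGHTS = (0.2989, 0.5870, 0.1140)


def rgb_to_gray(image):
    """Convert rows of (R, G, B) tuples (0-255) to rounded grey levels."""
    return [
        [round(WEIGHTS[0] * r + WEIGHTS[1] * g + WEIGHTS[2] * b) for (r, g, b) in row]
        for row in image
    ]


def min_max_stretch(values, new_min=0, new_max=255):
    """Linearly remap grey levels so the darkest becomes new_min and the
    brightest becomes new_max (assumed form of the min-max stretch)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant image has no contrast to stretch.
        return [new_min for _ in values]
    scale = (new_max - new_min) / (hi - lo)
    return [round(new_min + (v - lo) * scale) for v in values]


if __name__ == "__main__":
    tiny_image = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (255, 255, 255)]]
    print(rgb_to_gray(tiny_image))
    print(min_max_stretch([45, 85, 205]))
```

In MATLAB the same two operations would normally be done with toolbox functions rather than written by hand; the sketch only makes the arithmetic explicit.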
Techniques of image preprocessing

To clean the noise from images collected from many sources, the following techniques are applied [43].

1) Image Scaling

Image scaling is needed because the sizes of the training and testing images do not match. Some of the images are very large, which can cause implementation problems such as running out of memory. Reducing the image size also shortens the processing time. Therefore, all images were resized to [256,256]. We selected this step as part of preprocessing the images.

2) Min-Max Linear Contrast Stretch

The input images may have low contrast, which can affect the detection process. A min-max linear contrast stretch is therefore used to improve image quality. It remaps the lowest and highest values of the data to a new set of values that uses the full range of available intensity values. For example, suppose the lowest intensity value of an image is 45 and the highest is 205. The values from 0 to 44 and from 206 to 255 are then unused. The min-max linear stretch maps the lowest value to 0 and the highest value to 255, with the values in between scaled linearly.

4.2.3. Image segmentation

Segmentation is a strategy that divides an image into different regions based on properties that can be observed in the image, such as color, texture and boundaries [56]. It is the third step in our proposed method. In this research the images are divided into segments using the k-means clustering algorithm. Segmentation can be done using various methods, such as the Otsu method, k-means clustering, converting the RGB image into the HSI model and grey-level thresholding [56].
We selected K-means clustering. Before the images are clustered, the RGB color model is transformed into a contrast-enhanced model, which makes the segments easier to cluster.

K-means Clustering Algorithm

In k-means clustering, each point of the given dataset is repeatedly assigned to the centroid at the minimum distance [43]. In our research the distance between two points is calculated using the Euclidean distance, because the distance between any two objects does not change when new objects are added to the analysis. To run the K-means clustering algorithm we followed the steps below:

1. Choose the centers of the K clusters, either randomly or based on some heuristic.
2. Assign each pixel in the image to the cluster that minimizes the distance between the pixel and the cluster center.
3. Recompute each cluster center as the mean of the pixels in the cluster.

Repeat steps 2 and 3 until convergence is achieved.

Otsu Threshold Algorithm

Thresholding creates binary images from grey-level images by setting all pixels below some threshold to zero and all pixels above that threshold to one. The Otsu algorithm defined in [5] is as follows:

i) Separate the pixels into two clusters according to the threshold.
ii) Find the mean of each cluster.
iii) Square the difference between the means.
iv) Multiply by the number of pixels in one cluster times the number in the other.

The threshold for which this product is largest is the one selected.

An infected leaf shows the symptoms of the disease through a change in leaf color. Hence the greenness of the leaves can be used to detect the infected portion of the leaf. The R, G and B components are extracted from the image.
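The four Otsu steps listed above can be sketched as follows. This is an illustrative plain-Python version, not the MATLAB routine used in the research; the function names `otsu_threshold` and `binarize`, the flat list of 8-bit grey levels and the choice to send pixels equal to the threshold to the upper class are assumptions of this example.

```python
def otsu_threshold(pixels):
    """Return the grey level t that maximizes n0 * n1 * (mean0 - mean1)**2,
    where class 0 holds pixels below t and class 1 holds the rest."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    total_sum = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    n0 = 0      # number of pixels below the candidate threshold
    sum0 = 0    # sum of grey levels below the candidate threshold
    for t in range(1, 256):
        n0 += hist[t - 1]
        sum0 += (t - 1) * hist[t - 1]
        n1 = total - n0
        if n0 == 0 or n1 == 0:
            continue  # one cluster is empty, so no means to compare
        mean0 = sum0 / n0
        mean1 = (total_sum - sum0) / n1
        score = n0 * n1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


def binarize(pixels, t):
    """Set pixels below t to 0 and all others to 1."""
    return [1 if p >= t else 0 for p in pixels]


if __name__ == "__main__":
    grey = [10, 12, 11, 200, 210, 205]
    t = otsu_threshold(grey)
    print(t, binarize(grey, t))
```

Scanning every possible threshold once with running sums keeps the search linear in the number of grey levels, which is why the method is considered computationally inexpensive.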
The threshold is calculated using Otsu's method. The green pixels are then masked and removed if their intensities are less than the computed threshold. The survey titled "a survey on detection of disease and fruit grading" describes different segmentation techniques, summarized in Table 4-1 [58].

Table 4-1: Summary of different segmentation techniques

Segmentation technique: Thresholding method
Description: The simplest approach to image segmentation; it divides the image pixels based on their intensity level. The threshold value can be computed from the peaks of the image histogram.
Benefits: No prior information about the image is required. Fast, simple and computationally inexpensive. Easily applicable and suitable for real-life applications.
Drawbacks: Does not work well for images with broad, flat valleys or without any peak. Spatial information may be ignored, so the resulting image cannot guarantee that the segmented regions are contiguous. Threshold selection is crucial. Extremely sensitive to noise.

Segmentation technique: Region based method
Description: The segmentation region is constructed by associating and dissociating neighboring pixels. It works on the principle of homogeneity: adjacent pixels inside a specific region share related characteristics and are unrelated to the pixels in other regions.
Benefits: Flexible enough to choose between interactive and automatic techniques for image segmentation. Gives clearer object boundaries through the flow from the inner point to the outer region. Compared to other methods it gives more accurate results.
Drawbacks: Requires more computation time and memory and is sequential in nature. Noisy seed selection by the user leads to faulty segmentation. Because of the splitting scheme in region splitting, segments appear square.

Segmentation technique: Clustering method
Description: Pixels with similar characteristics are segmented into the same cluster. The image is clustered into different parts based on its features. The k-means algorithm is commonly used for this method.
Benefits: Homogeneous regions can be obtained easily. Computationally faster. K-means works faster for smaller values of K.
Drawbacks: Poor worst-case behaviour. It requires clusters of similar size, so that assignment to the adjacent cluster center is the correct assignment.

Segmentation technique: Edge based method
Description: All edges are detected first; then, to segment the required region, the edges are connected to form the object boundaries. It is based on discontinuity detection in edges.
Benefits: Works well for images with better contrast between regions.
Drawbacks: Does not work well for images with many edges. Selecting the right object edge is difficult.

Segmentation technique: Partial differential equation based segmentation method
Description: It is based on working with differential equations.
Benefits: Fastest method. Fast and appropriate for time-critical applications.
Drawbacks: Computational complexity is higher.

4.2.4. Feature extraction

Feature extraction is an important step for accurately predicting the infected region. Shape and texture features are extracted following the research paper "Plant disease detection and its solution using image classification" [57]. Shape-oriented features such as area, color axis length, eccentricity, solidity and perimeter are calculated, as are texture-oriented features such as contrast, correlation, energy, homogeneity and mean. The leaf image is captured and processed to determine the health of each plant. The output of this phase is a number of feature vectors corresponding to the segmented images produced by phase (3). Image features usually include color, shape and texture features.

Table 4-2: Summary of different color techniques

Method: L*a*b [58]
Description: a) This color space consists of one channel for luminance and two other channels, a and b, known as chromaticity layers. b) The space consists of dimension L for lightness and a and b for the color opponent dimensions.
Merits: a) Color and intensity are managed individually. b) It can measure small color differences.
Demerits: a) Problem of singularity, as with other nonlinear transformations.

Method: HSV histogram [59]
Description: a) HSV can be represented as a hexagon in three dimensions in which intensity is the central vertical axis. b) It stands for hue, saturation, value. c) Colors are described in terms of shades and brightness.
Merits: a) Accuracy is higher. b) Applicable to real-time applications.
Demerits: a) Sensitivity to lighting variations is less.

Method: RGB [58]
Description: a) It is a color space based on the RGB model. b) It consists of three independent image planes, one for each primary color: red, green and blue. c) It is an additive model.
Merits: a) Suitable for display.
Demerits: a) It is highly correlated, so it is not good for color image processing.

Method: YUV [58]
Description: a) The main channel, luminance, describes the light intensity, like the rod cells of the retina. b) The chrominance components U and V carry the color information. c) Black and white information is separated from the color information.
Merits: a) Overcomes the correlation of RGB to some extent and requires less computation time.
Demerits: a) Correlation exists, but less than in RGB.

According to A.A et al., there are different texture feature extraction techniques [60].

Table 4-3: Summary of different texture feature extraction techniques

Method: Grey Level Co-occurrence Matrices
Description: a) The grey level co-occurrence matrix is a statistical method used to examine texture, which considers the spatial relationship of pixels.
Merits: a) Feature vector length is small. b) Can be applied to different color spaces as a color co-occurrence matrix.
Demerits: a) Many matrices need to be computed. b) It is not invariant to rotation and scaling.

Method: Wavelet Transform
Description: a) It works better in the frequency domain than in the spatial domain.
Merits: a) Best features with higher accuracy can be produced.
Demerits: a) It is quite complex and slower.

Method: Independent Component Analysis
Description: a) It is a computational method for splitting a multivariate signal into additive subcomponents. b) It separates a mixed signal into a set of independent signals.
Merits: a) Small-order statistics can be obtained easily.
Demerits: a) It is a rarely used method.

Method: Gabor filter
Description: a) It is used to analyze specific frequency content in the image in specific directions in a localized region around the region of interest. b) It is used for orientation, spectral bandwidth and spatial extent.
Merits: a) It is a multi-resolution and multi-scale filter.
Demerits: a) Many filters are used in an application, so the overall computational cost is high.

Figure 4-4: Conversion of images to R, G and B images

4.2.5. Classification

The linear Support Vector Machine (SVM) algorithm is used to perform binary classification of whether or not an input image is infected with one of the diseases bacterial blight, Alternaria leaf spot and halo blight. The SVM separates healthy bean leaves from diseased ones by looking for the hyperplane that makes the margin between the nearest healthy and diseased samples as large as possible. According to A.A et al. [60], different classifiers are available, as summarized in Table 4-4.
Table 4-4: Summary of different classifiers

Classifier: Naive Bayes classifier
Description: a) It is a probabilistic classifier. b) It relies on a strong independence assumption. c) The value of a particular feature is independent of the value of any other feature.
Merits: a) Only a small amount of training data is required for classification.
Demerits: a) Interactions between features cannot be learnt because of the assumed independence among the features.

Classifier: K-nearest neighbor
Description: a) It is a statistical and non-parametric classifier. b) Weights can be assigned to the contributions of the neighbors, so that a nearer neighbor contributes more to the average than a distant one. c) A distance metric is calculated for the samples, and they are classified based on this distance. d) It uses Euclidean distance to calculate distance.
Merits: a) Implementation is simple. b) Does not require classes to be linearly separable.
Demerits: a) Very sensitive to noisy or irrelevant data. b) The testing process is more time-consuming because it requires calculating the distance to all known instances.

Classifier: Support Vector Machine
Description: a) It is based on decision planes that define decision boundaries. b) It works in two stages: 1) an off-line process and 2) an online process. c) A multi-class support vector machine, built as a set of binary support vector machines, is used for training and classification.
Merits: a) It is effective in high-dimensional spaces. b) In comparison with other classification techniques, classification accuracy is high. c) SVM is robust even when the training samples have some distortion.
Demerits: a) Training time is very high with large data sets. b) For mapping the original data into high-dimensional data, selecting the kernel function and kernel parameters is difficult.

Classifier: Decision Tree
Description: a) It repeatedly divides the working area into small sub-parts by identifying attributes. b) Leaves represent the class labels and branches represent the features that lead to those classes.
Merits: a) Small trees can be interpreted easily. b) For many simple data sets, accuracy is comparable with other classifiers.
Demerits: a) For some datasets it is observed to overfit on noisy classification tasks.

Classifier: Artificial Neural Network
Description: a) It is derived from the concept of the biological neuron system of the human. b) It uses two datasets, one for training and one for testing.
Merits: a) It is robust and can handle noisy data. b) Well suited to analyzing complex numerical data.
Demerits: a) Requires more training time. b) Requires large training samples. c) Requires more processing time.

SVM is usually used to recognize an object and label it with its name based on the information obtained during the feature extraction phase [61][62]. The descriptors of new images are compared with the descriptors of the images already in the database to categorize them. It is a supervised machine learning algorithm based on the concept of decision planes, where linearly separable classes can be identified using a hyperplane. Although this classifier takes time to train on images, it still performs well even if the training sample has some bias and is limited. The algorithm is a binary classifier but can also be applied to multiple classes. SVMs were extremely popular around the time they were developed in the 1990s and continue to be a go-to method for a high-performing algorithm with little tuning [63].

In machine learning, support vector machines are a set of supervised learning models with associated learning algorithms that analyse data for classification and regression analysis. Supervised learning is possible only if the data is labeled. An SVM constructs a hyperplane or a set of hyperplanes in a high- or infinite-dimensional space, which can also be used for other tasks such as outlier detection. The support vector machine is based on finding the hyperplane that gives the largest minimum distance to the training samples.
It analyses the data and then performs classification or regression, with the following advantages and disadvantages.

The advantages of support vector machines are:
- Effective in high-dimensional spaces.
- Still effective where the number of dimensions is larger than the number of samples.
- Memory efficient.
- Versatile.

The disadvantages of support vector machines are:
- If the number of features is much greater than the number of samples, care is needed to avoid over-fitting.
- SVMs do not directly provide probability estimates.

The proposed methodology for detecting crop disease with image processing is as follows.

Step 1: Acquire bean crop leaf images.
Step 2: Preprocess the image to reduce noise in the leaf image.
Step 3: Segment the image using K-means clustering into the affected and unaffected portions of the leaf.
Step 4: Select the affected region of interest, if there is one, from the segmented image.
Step 5: Extract features by computing the statistical parameters skewness, standard deviation, homogeneity, contrast, smoothness, correlation, kurtosis, energy, entropy, mean, variance, RMS and IDM.
Step 6: Use the support vector machine to detect the leaf type (diseased or healthy).
Step 7: Confirm the disease type and assess the percentage of the crop leaf that is diseased.

CHAPTER FIVE
RESULTS AND DISCUSSIONS

5.1. Introduction

This chapter reports the experimental results obtained in testing the effectiveness of our research. The type of classifier, the data set used and the results attained in the classification process are discussed. In addition, the discriminative power of color, size and shape is tested, evaluated and compared across a number of algorithms used in each processing step.

5.2. Data Set

A total of 100 faba bean crop leaf images were prepared to test the proposed model.
The samples are separated into 4 classes based on their characteristics, so there are 4 outputs, one for each class. The data were partitioned into bacterial blight, Alternaria leaf spot, halo blight and healthy. Of the total data set, 20 images are of bacterial blight, 20 of Alternaria leaf spot (Alternaria alternata), 20 of halo blight and 20 of healthy leaves; the remaining 20 are set aside for testing. A black background was chosen so that the actual leaf can be identified in the images. The bean plant samples are positioned directly under the camera for image acquisition. The SVM classifier used 80% of the data for training and the remaining 20% for testing.

The main objective of this research is to train the machine to predict the disease. Bean crop leaf disease is identified by observing different patterns on parts of the crop leaf. The design of the research consists of the sub-activities image collection, image preprocessing, image segmentation, feature extraction and classification using SVM. Detection of faba bean diseases starts with the acquisition of images, followed by preprocessing and enhancing the image and then segmenting it using the inverse difference method. The extracted texture features of the bean leaf are then passed to the SVM classifier to identify the disease.

5.3. Testing Techniques on MATLAB

Before testing the proposed classification technique on real images, which are often complex, we considered it necessary to test the technique on predictable, noise-free simulated images. The main purpose of using MATLAB-generated images for testing was to determine quickly whether the outcome of the test is correct or erroneous, based on inputs with known outputs.
The first test was performed on a MATLAB-generated noise-free color image with three colored areas: red, blue and green.

5.4. Implementation

MATLAB was selected to implement the algorithm. MATLAB has an image processing toolbox that contains the functions used to analyze images, such as reading, enhancement, conversion from one image type to another, segmentation and labeling. The research was implemented with MATLAB R2015b on an HP Core i7 personal computer with 8 GB RAM. The technique was tested on the different sets of data collected from the sources stated in the third chapter. First, MATLAB-generated artificial color images that are easy to classify manually were considered. The technique was then tested on images of healthy and diseased beans taken from different sites and an institution. In each step of image processing we used different algorithms and techniques and tested them with respect to detecting bean leaf disease.

First, we load an RGB image by clicking "LOAD IMAGES" and selecting the image that we want to preprocess. In the next stage, clicking the 'ENHANCE CONTRAST' button increases the contrast of the input image. The preprocessed image is then clustered into a fixed number of pieces, and we select the one cluster (among cluster 1, cluster 2 and cluster 3) that includes the diseased part of the leaf in order to see its features. In our case the clustered image has 13 parameters that are used to classify it into its type (Bacterial blight, Alternaria leaf spot, Halo Blight, Healthy). Clicking "CLASSIFICATION RESULT" returns the type of disease detected on the leaf, or the message "healthy", together with the percentage of the region affected (if the leaf is affected). Finally, we checked the accuracy of the classification of the input images by clicking the "Accuracy" button.

5.4.1. Stage One: Image acquisition

Sample images of diseased leaves are collected and used in training the system. To train and test the system, diseased leaf images and some healthy images are taken. The images are stored in their captured or preprocessed format. In this research, we took images available on the internet of leaves infected by Alternaria leaf spot, halo blight and bacterial blight.

Image Conversion

The system reads the images with the function (a = imread(path);) and converts them to greyscale with the function (b = rgb2gray(a);) in order to examine the different color features of the images used as input for training. The input images were also separated into Red, Green and Blue channel images. The RGB images were converted to HSI for higher efficiency in observing them and to see additional features, as shown below. The images must also be resized and their background color changed.

Figure 5-1: Conversion of RGB2HSI

5.4.2. Stage two: Image Preprocessing

Image preprocessing is essential for real data, which are frequently noisy and irregular [64]. In this phase, the image is transformed into another image of improved quality that is better suited for analysis. Properties like boundaries and edges are better viewed in black-and-white images; statistical properties related to intensities are observed in greyscale format; and information related to color is seen well in RGB, HSI and other color formats of the image. In this system, the images are resized to 256x256 and thresholding is done using Otsu's method, which converts the intensity image to a binary image. The RGB image can also be converted to a greyscale image. The histogram of the input image is used to compute the mean of the distribution, which is then scaled to a normalized value between 0 and 1. The image below is the preprocessing result for a leaf detected with bacterial blight.
Figure 5-2: Preprocessing images

Histogram equalization is applied to the RGB image because it usually increases the global contrast of the image and is also useful for images that are too bright or too dark. Histogram equalization is a traditional approach to image contrast adjustment and is used here for image enhancement; the result is shown in the figure below. The histogram is a graph showing the number of pixels in an image at each intensity level.

Figure 5-3: Conversion of image to R, G and B and Histogram of the R, G and B

Figure 5-4: Histogram equalization

5.4.3. Stage Three: Image segmentation

Here, the given image is separated into similar regions based on its features. Clustering groups a larger data set into clusters of smaller, similar data sets. We used the K-means clustering algorithm to segment the given image into three clusters, one of which contains the diseased part of the leaf. Since all of the colors have to be considered for segmentation, intensities are set aside and only color information is taken into consideration. In K-means segmentation of bean crop leaf images, the images are segmented as follows.

1. Divide the given data set into K clusters by assigning the data points randomly.

2. For each data point, compute the distance from the data point to each cluster using the Euclidean distance between two pixel points:

\[ d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} \]

where (x1, y1) and (x2, y2) are two pixel points (or two data points).

3. A data point that is nearest to the cluster it belongs to is left as it is.

4. A data point that is not nearest to the cluster it belongs to is moved to the nearest cluster.

5. Repeat the above steps for all data points.

6. Once the clusters no longer change, the clustering process stops.
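The clustering loop in the steps above can be sketched as follows. This is an illustrative plain-Python version rather than the MATLAB implementation used in the research; the function name `kmeans`, the use of caller-supplied starting centers instead of a random initial assignment (chosen so the result is reproducible) and the toy two-dimensional points are assumptions of this example.

```python
import math


def kmeans(points, centers, max_iter=100):
    """Cluster tuples of numbers around the given starting centers.

    Returns (labels, centers), where labels[i] is the index of the cluster
    that points[i] ends up in.
    """
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assign every point to the nearest center by Euclidean distance.
        new_labels = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # no point changed cluster, so the clusters are stable
        labels = new_labels
        # Move each center to the mean of the points assigned to it.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous center
                centers[k] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centers


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11), (20, 0), (21, 0), (20, 1)]
    labels, centers = kmeans(pts, centers=[(0, 0), (10, 10), (20, 0)])
    print(labels)
    print(centers)
```

For leaf images the points would be per-pixel color values (for example the a and b channels after the L*a*b conversion shown in Figure 5-5) rather than the toy coordinates used here, with K set to 3 as in this research.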
The clusters have their own structures, which are identified and measured in order to classify each image according to the disease that affects the leaf.

Figure 5-5: Conversion of RGB to L*a*b color

Figure 5-6: K-means clustering

5.4.4. Stage Four: Feature extraction

The features of the input images must be extracted. Instead of using the total set of pixels, we choose only those that are necessary and sufficient to describe the whole segment. The segmented image is first selected manually. The affected area of the image is found by calculating the area of the connected components. First, the connected components with 6 neighborhood pixels are found. Then the basic region properties of the input binary image are computed. The only property of interest here is the area, from which the affected area is obtained. The percentage of area covered in this segment indicates the quality of the result.

The histogram of an image provides information about the frequency of occurrence of each value in the whole image, and it is an important tool for frequency analysis. The co-occurrence matrix takes this analysis to the next level: it records how often two intensities occur together in a pair of pixels, which makes it a powerful tool for texture analysis. From the gray-level co-occurrence matrix, the features Contrast, Correlation, Energy and Homogeneity are extracted. The features standard deviation (SD), mean, entropy, RMS, variance, smoothness, kurtosis, skewness, IDM, contrast, correlation, energy and homogeneity have been calculated and used as input to classify the images as healthy or diseased bean plants.

The training data are processed with the following procedure:

1. Start with images whose classes are known.
2. Calculate the feature set for each of them and label it.
3. Take the next image as input and calculate its features as a new input.
4. Implement the binary SVM to multi-class SVM procedure.
5.
Train the SVM using the kernel function of choice. The output contains the SVM structure and information such as the support vectors and the bias value.
6. Determine the class of the input image.
7. Depending on the outcome class, assign the label to the next image and add its feature set to the database.
8. Repeat steps 3 to 7 for all the images that are to be used as a database.
9. The testing procedure consists of steps 3 to 6 of the training procedure. The outcome is the class of the input image.
10. To find the accuracy of the system (in this case, of the SVM), random sets of inputs are chosen from the database for training and testing.

GLCM texture features are extracted from the segmented image. These features form a feature vector which, together with the training labels, serves as input for training the classification model.

5.4.5. Stage Five: Classification

The SVM classifier uses a hyperplane as the decision boundary between two classes. SVM is important in pattern recognition problems such as texture classification. It maps nonlinear input data into a high-dimensional space in which the data become linearly separable, which provides good classification. SVM maximizes the margin between the different classes, and different kernels can be used to separate the classes. It is basically a binary classifier that determines the hyperplane dividing two classes such that the margin between the hyperplane and the two classes is maximized. Support vectors are the samples nearest to the margin, and they determine the hyperplane. Multiclass classification is also possible, using either a one-to-one (one-vs-one) or a one-to-many (one-vs-rest) scheme. The class with the highest output function is selected as the target class.

The system classifies the disease type of the bean crop by displaying the type of disease and calculating the proportion of the region that is affected by the identified disease.
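The feature extraction and classification stages described above can be sketched together in Python. This is a simplified illustration under stated assumptions rather than the system built in this work: the function names `glcm`, `texture_features` and `predict_one_vs_rest` are hypothetical, the co-occurrence matrix is computed for a single horizontal neighbour offset, only four of the thirteen features are shown, and the linear weights are made-up placeholders standing in for trained binary SVMs.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Normalised co-occurrence matrix of gray levels for the pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[n / total for n in row] for row in counts]

def texture_features(p):
    """Contrast, correlation, energy and homogeneity of a normalised GLCM p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mean_i = sum(i * v for i, _, v in cells)
    mean_j = sum(j * v for _, j, v in cells)
    sd_i = math.sqrt(sum((i - mean_i) ** 2 * v for i, _, v in cells))
    sd_j = math.sqrt(sum((j - mean_j) ** 2 * v for _, j, v in cells))
    cov = sum((i - mean_i) * (j - mean_j) * v for i, j, v in cells)
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        # Correlation is undefined for a constant image (zero spread).
        "correlation": cov / (sd_i * sd_j) if sd_i * sd_j else float("nan"),
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
    }

def predict_one_vs_rest(x, models):
    """Pick the class whose linear decision value w.x + b is highest."""
    def decision(name):
        w, b = models[name]
        return sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(models, key=decision)

# Toy 4x4 image with 4 gray levels (illustrative values, not thesis data).
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
features = texture_features(glcm(image, levels=4))
x = [features[k] for k in ("contrast", "correlation", "energy", "homogeneity")]

# Placeholder (w, b) pairs; a real system would obtain them from SVM training.
models = {
    "class_a": ([1.0, 0.0, 0.0, 0.0], 0.0),
    "class_b": ([0.0, 0.0, 0.0, 1.0], 0.0),
}
predicted = predict_one_vs_rest(x, models)
```

In a one-vs-rest arrangement, one binary classifier is trained per class and the class with the largest decision value wins, which mirrors the rule that the class with the highest output function is selected.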
Finally, the system calculates the accuracy of the SVM classification.

Figure 5-7: Bean crop disease detection GUI

Using the SVM classifier, the system determined the status of bean crop leaves based on the parameters calculated during feature extraction. Each result fell into one of four classes: bacterial blight, Alternaria leaf spot (Alternaria alternata), halo blight and healthy leaves. The classifier selected its inputs from the datasets of each diseased class and of the healthy leaves. Out of a total of 100 images, about 96 were correctly classified into their classes (bacterial blight, Alternaria leaf spot, halo blight and healthy). The classification process has an average accuracy of 96.77% with an error rate of 3.23%. From the total images, we partitioned sets of 20 images for each of the four classes and correctly detected 19, 18, 19 and 18 images of Alternaria leaf spot, bacterial blight, halo blight and healthy leaf, respectively. The average detection rate of diseases is 92.5%.

Table 5-1: Accuracy value for each disease detection (%)

No.  Types of leaf                        Accuracy (%)  Error rate (%)
1.   Detected with Alternaria leaf spot   95            5
2.   Detected with halo blight            90            10
3.   Detected with bacterial blight       95            5
4.   Healthy                              90            10

Figure 5-8: Accuracy and error rate detection of diseases (bar chart titled "Performance Evaluation" showing accuracy and error rate, in %, by bean crop status: Alternaria leaf spot, bacterial blight, halo blight and healthy)

The general procedure of the bean crop leaf disease detection and classification system was as follows:

1. Read the input image.
2. Resize the image from step 1.
3. Enhance the contrast of the resized image.
4. Apply the K-means clustering operation.
5. Segment the image into three clusters (Cluster 1, 2 and 3).
6. Select the disease-affected area from the clusters of step 5.
7. Filter the image using a median filter.
8. Extract features of the image using the Gray-Level Co-occurrence Matrix (GLCM).
9.
Compute skewness, standard deviation, homogeneity, contrast, smoothness, correlation, kurtosis, energy, entropy, mean, variance, RMS and IDM.
10. Classify the disease type using the support vector machine.
11. Compute and display the accuracy.

CHAPTER SIX
CONCLUSION AND RECOMMENDATION

6.1. Conclusion

Agriculture is the foundation of Ethiopia's economy, accounting for half of gross domestic product (GDP), 83.9% of exports, and 80% of total employment. Ethiopia's agriculture is beset by disease, periodic drought, soil degradation caused by overgrazing, deforestation, high levels of taxation and poor infrastructure (making it difficult and expensive to get goods to market). Yet despite those obstacles, agriculture is the country's most promising resource. Crops are what allow the country to build up its economy through production, and among them bean is one that is rich in protein. Production is overwhelmingly of a subsistence nature, and a large part of commodity exports is provided by the small agricultural cash-crop sector. Principal crops include coffee, pulses (e.g., beans), oilseeds, cereals, potatoes, sugarcane, and vegetables. However, the crops are affected by diseases that reduce production and make it unhealthy.

We selected the bean crop in order to contribute to increasing the country's production by detecting its diseases with image processing and the machine learning algorithm called SVM. Biological pest control has great advantages in assuring the safety of workers, protecting the environment, and reducing cost while increasing quality. To achieve biological pest control, the disease has to be identified at its early stage. Thus, there is no doubt about the value of developing an automatic system that identifies bean disease at its early stage.
Accordingly, to identify different bean diseases, we chose digital image processing, an active research area in computer science. Digital image processing is a means of processing digital images using a digital computer. Every digital image processing application follows some fundamental steps, such as image acquisition, preprocessing, segmentation, feature extraction and classification.

For this research, we used digital image processing to develop a method for automatic identification of bean crop diseases in two phases: a training phase and a testing phase. In the first phase, bean crop leaf images are captured. Then the images are preprocessed in order to remove noise, lighting effects and other artifacts. After preprocessing, the images are segmented using K-means clustering to identify the region of interest, and useful features are extracted. For this research we selected texture features: the segmented bean leaf image is median filtered and its texture features are extracted using the GLCM. We represented the texture of the image using thirteen different statistical measures. Finally, those thirteen texture features are used to create the knowledge base on which the classifier is trained.

In the testing phase, bean crop leaf images different from those used in the training phase are captured. Then, as in the first phase, the images are preprocessed and segmented, and useful texture features are extracted using the aforementioned technique. To test the classification accuracy of the system, an independent data set was used. The data set contains texture features of normal and diseased bean crop leaf images that are extracted using GLCM.
The experimental results show that texture features are effective for classifying a bean crop disease into its class. In general, identification of bean crop disease can be done automatically using an image processing technique. Using the test data, the three disease classes (bacterial blight, halo blight and Alternaria leaf spot) and the healthy class are identified with accuracies of 95%, 90%, 95% and 90%, respectively, and the overall performance of the classifier is 96.77%.

6.2. Recommendation

In Ethiopia, no research has been conducted on the identification of bean crop diseases to support the agricultural sector. Hence, this research work may encourage other researchers to work in this area. Image analysis for the identification of bean crop disease can be investigated further, and the work can be examined in more depth using different structures of the bean crop image. The following recommendations are made for further research and improvement.

✓ In this research we have built a system that identifies the type of disease that attacks the bean crop. However, the system does not estimate the severity of the disease it identifies. Therefore, automatically estimating the severity of the identified disease is one research direction.
✓ After the crop disease is identified, it is good to recommend an appropriate treatment. Automatically recommending the appropriate handling technique for the identified disease is therefore another research direction.
✓ Extend the database to cover more bean crop diseases by using a larger number of images for training the classifier.
✓ Identify bean crop diseases with machine learning algorithms other than SVM.
✓ Improve the disease detection and classification accuracy beyond what our classifier has achieved.