Research Journal of Applied Sciences, Engineering and Technology 4(18): 3215-3221, 2012
ISSN: 2040-7467
© Maxwell Scientific Organization, 2012
Submitted: December 23, 2011
Accepted: February 22, 2012
Published: September 15, 2012
Performance Evaluation of Discriminant Analysis and Decision Tree for Weed Classification of Potato Fields
1Farshad Vesali, 2Masoud Gharibkhani and 3Mohmmad Hasan Komarizadeh
1Department of Agriculture Machinery Engineering, University of Tehran, Karaj, Iran
2Department of Agricultural Machinery Engineering, Ege University, Bornova, İzmir, Turkey
3Department of Mechanic of Agricultural Machinery Engineering, University of Urmia, Urmia, Iran
Abstract: This study aimed to recognize weeds in potato fields so that herbicides can be applied effectively. Potato is cultivated widely all over the world and is a major food crop consumed by over one billion people, but it is threatened by weed invasion because of the row cropping system used in potato cultivation. Machine vision was used in this research for effective application of herbicides in the field. About 300 color images were acquired from 3 potato farms of Qorveh city and 2 farms of Urmia University, Iran. Images were acquired under different illumination conditions, from morning to evening, on sunny and cloudy days. Because plants overlap and shade each other under farm conditions, it is hard to use morphological parameters. In the method used for classifying weeds and potato plants, the primary color components of each plant were extracted and the relation between them was estimated in order to determine a discriminant function and classify the plants using discriminant analysis. In addition, the decision tree method was used and its results were compared with those of discriminant analysis. Three different classifications were applied. First, the potato plant was discriminated from all weeds taken together (two groups); the rate of correct classification was 76.67% for discriminant analysis and 83.82% for the decision tree. Second, the potato plant was discriminated from each weed as a separate group (6 groups); the rate of correct classification was 87%. Third, the potato plant was classified against each weed species one by one; as the weeds differed, the classification results differed in this composition. The decision tree showed better results than discriminant analysis in all conditions.
Keywords: Color components, potato plant, three compositions of classification, weed
INTRODUCTION
The potato (Solanum tuberosum L.) is a major world food crop. In world food production, potato is exceeded only by rice, wheat and maize. Potatoes are consumed by over one billion people worldwide, half of them in developing countries. Potato is the fifth farm product of the world and the third of Iran (FAO, 2010). Eskandari et al. (2011) investigated hand weeding, power machine weeding and weeding with herbicides in rice fields and reported that chemical treatment was the best treatment because it produced the least weed dry weight in all sampling stages. This control had an important effect on yield, and the highest yield belonged to this treatment. They also stated that herbicides are most efficient if they are applied only to the weeds (not to the soil or the main plant). However, weed invasion of potato farms leads to increasing use of selective and nonselective herbicides, which pollutes the land and the environment. In addition, the high price of selective herbicides makes them uneconomical for farmers. If a machine can spray the herbicide directly on the weeds, the amount used will be reduced by 50 to 70%, which saves money and reduces environmental pollution. The objective of this study was to present a useful and fast method for segregating weeds from potato plants using machine vision. Studies toward this aim fall into two divisions: the first is recognizing weeds between the planting rows and the second is recognizing weeds on the planting row.
In the first division, all the plants between the planting rows should be removed, no matter whether they are weeds or not. Based on this principle, Woebbeck et al. (1995) simply tried to separate plants from the soil background, whereas Lee et al. (1999) distinguished tomato leaves from weeds using some characteristics of the tomato plant and designed an herbicide sprayer. Recognizing weeds based on texture analysis is a powerful method for separating smooth surfaces from rough and non-uniform surfaces. Scarr et al. (1998) used this method for onion, because onion leaves stand straight up and shape assessment methods cannot be used. With texture analysis, the problems related to shape assessment and overlapping leaves were almost solved. El-Faki et al. (2000) compared four indices based on the main color components of plants to evaluate the parameters that affect weed recognition. They surveyed the effect of parameters such as soil moisture, light intensity and image resolution on weed recognition accuracy and also evaluated the efficiency of this method under field conditions. Astrand and Baerveldt (2003) used combinations of color and shape features for segmenting weeds in sugar beet. They evaluated shape features of single plants and showed that plant recognition based on color vision is feasible with three features and a 5-nearest neighbor classifier. Color features alone gave a classification success rate of up to 92%; this rate increased to 96% when two shape features were added. Yang et al. (2000) developed a data mining technique, decision-tree analysis, and used it to classify multi-spectral data sets from 33 experimental plots containing various combinations of crop (corn or soybean) and weed species. Their results indicate that a reasonable degree of differentiation (0.85±0.06) may be obtained for the most complex of the classification problems investigated, which was to classify the 33 plots into 11 plot categories. They also reported that classification success may be mainly related to the input type and may or may not be related to the number of inputs (Yang et al., 2000). Burgos-Artizzu et al. (2009) used Case-Based Reasoning (CBR) to choose the best pattern for segmenting new images based on previous segmentation experiments and were able to speed up weed recognition on the planting row using this method, which increased the Pearson correlation from 60.1 to 79.7%.
The main goal of this study is to reduce herbicide usage in potato fields by recognizing the weeds so that herbicides can be applied effectively. To achieve this goal, image processing was used to distinguish the weeds from the potato plants, and the ability of decision trees and discriminant analysis to classify weeds and potato plants was then evaluated.
Corresponding Author: Farshad Vesali, Department of Agriculture Machinery Engineering, University of Tehran, Karaj, Iran
MATERIALS AND METHODS
For the image database of this research, 300 color images were acquired in 2010 from 3 potato farms of Qorveh city and 2 farms of Urmia University. The image acquisition system consisted of a digital camera (Sony CyberShot W200, Japan), and the image capture size was 2,048×1,538 pixels in the horizontal and vertical directions, respectively. Most of these images contained five of the most important weed species of potato fields: Convolvulus arvensis (field bindweed), Centaurea arvense, Cirsium arvense (creeping thistle), Chenopodium murale (salt-green or sowbane) and Chenopodium album (goosefoot). The algorithms were developed with the Image Processing Toolbox of MATLAB R2009a.
Classification of weeds is influenced by the weeding time of the farm. In farms where crop and weeds grow concurrently and weeding is done in the first stages of growth, when both crop and weeds are small, the success probability of morphological methods is high. For most row crops such as potato, however, the overlapping of crop and weed leaves makes it almost impossible to apply morphological methods for classification, because it is very hard to assign a specific and uniform shape to the plant across different images. Other factors, such as leaf deflection, leaf deformation caused by environmental conditions like plant diseases or weed invasion, and leaf movement by wind, also make the assessment of morphological characteristics very hard. Therefore, color processing and texture recognition were evaluated in this study, and these traits were finally used for separating weeds from potato plants by means of a classifier.
Preprocessing operations: As the images were acquired under field conditions, preprocessing was necessary to reduce the effect of light variation. The light intensity and contrast of images with a low mean light intensity were therefore increased using the coefficient of the gamma correction function, which reduced the light variation among images. This effect was evident when images were inspected before and after preprocessing.
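The gamma correction step described above can be sketched in a few lines of Python. This is only an illustration and not the MATLAB code used in the study: the gamma value of 0.6, the mean-intensity threshold of 100 and the function names are assumptions, since the paper does not report them.

```python
def gamma_correct(channel, gamma=0.6):
    """Apply out = 255 * (in / 255) ** gamma to an 8-bit channel.

    `channel` is a list of rows of integers in 0..255. A gamma below 1
    brightens dark images; 0.6 is an assumed value, not from the paper.
    """
    return [[round(255 * (v / 255) ** gamma) for v in row] for row in channel]


def mean_intensity(channel):
    """Mean pixel value of a channel."""
    total = sum(sum(row) for row in channel)
    count = sum(len(row) for row in channel)
    return total / count


def preprocess(channel, dark_threshold=100, gamma=0.6):
    """Brighten only images whose mean intensity is low (assumed threshold)."""
    if mean_intensity(channel) < dark_threshold:
        return gamma_correct(channel, gamma)
    return channel


dark = [[10, 40], [80, 120]]   # toy 2x2 channel with a low mean intensity
bright = preprocess(dark)
```

With a gamma below 1 the end points 0 and 255 are unchanged while mid-range values are raised, so dark images are brightened without saturating the bright ones.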
Separating plant and soil background: At first it seems that, because of the green color of the plant, the plant can easily be separated from the soil by thresholding at the valley of the green component histogram. However, as the background (soil) color is not constant and light intensity always varies between different parts of an image and between images, that method cannot be applied, so another method was used. As the dominant color of the plant is green, the green component (G) should be larger than the mean of the other two components (R and B), as expressed by Eq. (1):
$$G > \frac{R + B}{2} \;\Rightarrow\; 2G - R - B > 0 \qquad (1)$$
Using this equation with zero as the threshold, it is possible to separate the plant from the soil (Gonzalez and Woods, 2008). After applying Eq. (1), some noise remained in the images. To remove it, the images were converted to binary images and opening and closing operations were applied; as a result, a complete segmentation of the plant without noise was obtained.
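A minimal Python sketch of this segmentation step is given here: thresholding 2G - R - B at zero, followed by morphological opening and closing. It is not the MATLAB implementation used in the study. The 3×3 square structuring element, the zero padding at the image border, the toy pixel colors and all function names are assumptions made for illustration.

```python
def excess_green_mask(rgb):
    """Binary mask: 1 where 2G - R - B > 0 (Eq. 1 with threshold zero)."""
    return [[1 if 2 * g - r - b > 0 else 0 for (r, g, b) in row] for row in rgb]


def _morph(mask, rule):
    """Apply `rule` (all or any) to each 3x3 window; outside pixels count as 0."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                mask[yy][xx] if 0 <= yy < h and 0 <= xx < w else 0
                for yy in (y - 1, y, y + 1)
                for xx in (x - 1, x, x + 1)
            ]
            out[y][x] = 1 if rule(window) else 0
    return out


def erode(mask):
    return _morph(mask, all)


def dilate(mask):
    return _morph(mask, any)


def opening(mask):
    """Erosion then dilation: removes small isolated foreground specks."""
    return dilate(erode(mask))


def closing(mask):
    """Dilation then erosion: fills small holes inside the foreground."""
    return erode(dilate(mask))


# Toy 11x11 image: a 7x7 plant patch with a one-pixel hole and a stray speck.
SOIL, PLANT = (120, 100, 80), (75, 124, 79)   # assumed example colors
image = [[SOIL] * 11 for _ in range(11)]
for y in range(2, 9):
    for x in range(2, 9):
        image[y][x] = PLANT
image[5][5] = SOIL     # small hole inside the plant
image[0][10] = PLANT   # isolated noise pixel on the soil
mask = closing(opening(excess_green_mask(image)))
```

In this toy case the opening removes the isolated pixel and the closing fills the hole, so the final mask contains exactly the 7×7 plant patch.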
Table 1: Mean and Standard Deviation (SD) of the three primary color components of potato plant and five types of weeds
                              R                  G                  B
                       ----------------   ----------------   ----------------
Plant                  Mean      SD       Mean      SD       Mean      SD
Centaurea arvense      136.73    44.03    166.42    45.39    135.05    53.08
Convolvulus arvensis   108.46    37.09    153.53    38.79    119.73    43.84
Cirsium arvense         96.12    35.58    142.07    35.73     97.18    38.76
Chenopodium murale     112.23    39.98    159.42    39.22     99.24    41.38
Chenopodium album       93.74    33.75    144.91    33.65    102.44    38.65
Potato                  75.31    36.41    124.25    40.84     78.54    38.27
Weeds                  105.18    39.19    151.62    38.33    106.44    43.19
Classifying potato plant and weeds: Potato plants and weeds were classified using two methods: discriminant analysis and a decision tree. Discriminant analysis uses training data to estimate the parameters of discriminant functions of the predictor variables. Discriminant functions determine boundaries in predictor space between the classes, and the resulting classifier discriminates among the classes (the categorical levels of the response) based on the predictor data. Discriminant Analysis (DA) is thus a technique for building a predictive model of group membership based on the observed characteristics of each case. A discriminant function is a linear or nonlinear combination of the standardized independent variables that yields the largest mean differences between the groups. As mentioned before, El-Faki et al. (2000) used combinations of color components, but when the color component indices were surveyed for potato plants and weeds, no useful difference was seen between them, and the negative values of some indices led to wrong recognition. Therefore, the three primary color components were used as the input of the discriminant analysis. These values were taken from 175 segmented images out of the 300 acquired. As each set of components is obtained from a single pixel, the amount of input data for discriminant analysis was too large. Therefore, to reduce the amount of input data and to equalize the influence of each weed type, the data for potato plants and weeds were reduced to 50,000 in total: 10,000 for potato and 8,000 for each weed. The mean and standard deviation of these data for each group are shown in Table 1.
Discriminant analysis was used to determine the membership model of the data. This procedure was carried out with the Statistics Toolbox of MATLAB.
The other method used to segregate potato plants and weeds was the decision tree. A Decision Tree (DT) is a machine learning algorithm based on a sequential divide-and-conquer approach (Han and Kamber, 2001; Breiman et al., 1984). The algorithm builds classification or regression models in an unambiguous way by recursively partitioning the data. It learns the predictor or response pattern from the training data and derives a series of decision rules that represent the pattern. Each rule divides the training data into subsets using a threshold. In most DT algorithms, the data are split into two subsets at a time (binary tree). These decision rules are hierarchical and sequential in nature and can be presented in a flowchart-like, top-down tree structure (Witten and Frank, 2000; Liu and Paulsen, 2000). In this study, a univariate binary tree was used for each group. In this type of decision tree, which is considered simple and quick, there is just one condition in each node. In both methods (DA and DT), 80% of the data were chosen randomly for training and the remaining 20% were used for testing.
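To make the idea of a univariate binary tree concrete, the following Python sketch grows a small tree with one threshold test per node and evaluates it on a random 80/20 split. It is an illustration only and not the MATLAB implementation used in the study. The Gini impurity criterion, the depth limit of 4, the sample size of 200 pixels per class and the synthetic pixels (drawn from normal distributions with the Table 1 means and standard deviations) are all assumptions.

```python
import random


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(rows, labels):
    """Return (score, feature, threshold) of the best single-feature test."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows})[:-1]:
            left = [l for r, l in zip(rows, labels) if r[f] <= t]
            right = [l for r, l in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def build_tree(rows, labels, depth=0, max_depth=4):
    """Recursively split the data; a leaf is the majority label (a string)."""
    majority = max(sorted(set(labels)), key=labels.count)
    if depth == max_depth or len(set(labels)) == 1:
        return majority
    split = best_split(rows, labels)
    if split is None:  # all rows identical, nothing left to split on
        return majority
    _, f, t = split
    left = [(r, l) for r, l in zip(rows, labels) if r[f] <= t]
    right = [(r, l) for r, l in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [l for _, l in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [l for _, l in right], depth + 1, max_depth))


def predict(tree, row):
    """Follow one univariate test per node until a leaf label is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


def sample(stats, n):
    """Draw n synthetic RGB pixels from per-channel (mean, SD), clipped to 0..255."""
    return [[min(255.0, max(0.0, random.gauss(m, s))) for m, s in stats]
            for _ in range(n)]


random.seed(0)
# Synthetic stand-in data based on Table 1; NOT the pixels used in the paper.
potato = sample([(75.31, 36.41), (124.25, 40.84), (78.54, 38.27)], 200)
weeds = sample([(105.18, 39.19), (151.62, 38.33), (106.44, 43.19)], 200)
data = [(p, "potato") for p in potato] + [(w, "weed") for w in weeds]
random.shuffle(data)
cut = int(0.8 * len(data))                 # 80% training, 20% testing
train_set, test_set = data[:cut], data[cut:]
tree = build_tree([r for r, _ in train_set], [l for _, l in train_set])
ccr = sum(predict(tree, r) == l for r, l in test_set) / len(test_set)
```

Each internal node holds a single (feature, threshold) pair, which is what makes the tree univariate. Because the data here are synthetic, the resulting rate is not comparable with the rates reported in the paper.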
In this study, three different compositions of potato against weeds were considered:
(a) Classification of potato versus all weeds as one group (two groups)
(b) Classification of potato and each weed species as separate groups (six groups)
(c) Classification of potato versus each weed species, one by one
In the first composition, classification was done between 10,000 R, G, B (Red, Green, Blue) data of potato plants and 40,000 data from all weed types as one group. In the second composition, classification was done between potato plant and weeds in six groups, with 10,000 data for potato and 8,000 for each kind of weed. Finally, in the third composition, potato plant was classified separately against each of the five weed types.
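The group structure of the three compositions can be written down explicitly. The short Python sketch below only restates the sample counts given in the text; the dictionary layout and the function names are illustrative assumptions.

```python
WEEDS = ["Centaurea arvense", "Convolvulus arvensis", "Cirsium arvense",
         "Chenopodium murale", "Chenopodium album"]
# 10,000 pixel samples for potato and 8,000 for each weed species.
SAMPLES = {"Potato": 10_000, **{w: 8_000 for w in WEEDS}}


def composition_a():
    """Two groups: potato versus all weeds pooled together."""
    return {"Potato": SAMPLES["Potato"], "Weeds": sum(SAMPLES[w] for w in WEEDS)}


def composition_b():
    """Six groups: potato and each weed species as its own class."""
    return dict(SAMPLES)


def composition_c():
    """Five two-group problems: potato versus one weed species at a time."""
    return [{"Potato": SAMPLES["Potato"], w: SAMPLES[w]} for w in WEEDS]
```

Composition a is therefore unbalanced (10,000 against 40,000 samples), whereas each problem in composition c compares 10,000 potato samples with 8,000 samples of a single weed.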
RESULTS AND DISCUSSION
To evaluate the accuracy of separating plants from the soil background with Eq. (1), 10 images were chosen randomly, and the plants in each were outlined and separated accurately from the soil background by hand using Adobe Photoshop CS4. The image obtained from this manual separation was subtracted from the image obtained with Eq. (1) in MATLAB. Ideally the result of the subtraction would be zero everywhere, giving a completely black image, but in practice some pixels remained. Dividing the number of these pixels by the number of all pixels in the image gives the error percentage of the soil background separation. This procedure is shown in Fig. 1.
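The error measure described above amounts to counting the pixels on which the two binary masks disagree. A minimal Python sketch follows, assuming that every differing pixel counts as an error regardless of the sign of the difference; the function name and the toy masks are illustrative.

```python
def segmentation_error_percent(auto_mask, manual_mask):
    """Percentage of pixels where the two binary masks disagree."""
    differing = total = 0
    for auto_row, manual_row in zip(auto_mask, manual_mask):
        for a, m in zip(auto_row, manual_row):
            differing += a != m
            total += 1
    return 100.0 * differing / total


# Toy 4x4 masks that differ in exactly one pixel.
manual = [[0, 0, 0, 0],
          [0, 1, 1, 0],
          [0, 1, 1, 0],
          [0, 0, 0, 0]]
auto = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 1],
        [0, 0, 0, 0]]
error = segmentation_error_percent(auto, manual)   # 1 of 16 pixels differ
```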
This was done for the other images too, and the mean error of soil background separation in these 10 images was 2.3%. This low error rate shows that the combination of opening and closing operations with preprocessing and Eq. (1) can separate the soil background accurately. For comparison, El-Faki et al. (2000) encountered a high error rate (about 15%) in separating the soil background, possibly because they used Eq. (1) without any preprocessing before separation. To reduce the separation error caused by light changes, Jafari et al. (2006) had to split their images into two groups (images in light and images in shadow) (Jafari et al., 2006; Scarr et al., 1998).
Fig. 1: Comparison between two segregation methods; (A) Segregation using Eq. (1); (B) Segregation using Adobe Photoshop CS4 software; (C) Deviation between the two methods
Fig. 2: Part of the decision tree used to classify the weeds from potato
Table 2: The result of classifying potato and weeds in composition a
                                    Predicted group membership
                     -------------------------------------------------------
                     Discriminant analysis          Decision tree
                     ------------------------       ------------------------
Actual group         Weeds        Potato            Weeds        Potato
Potato (count)       153          838               408          583
Potato (%)           15.44        84.56             41.17        58.83
Weeds (count)        2995         1014              3608         401
Weeds (%)            74.71        25.29             90.00        10.00
Correct classification rate (%)   76.67                          83.82
When the potato plant was classified against all weed types placed in one group (composition a), the classification rate of both methods was higher than when the potato plant and each kind of weed were classified as separate groups (composition b). In composition c, different classification rates were obtained depending on the type of weed.
Fig. 3: 3D graph of the discriminant function classifying potato plant from weeds using the R, G and B component values. (a) 3D view; (b) View of the R-B plane (blue dots stand for weeds and red dots stand for potato plant)
Table 2 shows the group membership predicted by both methods in the test stage for composition a.
The comparison shows that, in this composition, the decision tree classifies potato plants and weeds more accurately than discriminant analysis. This decision tree has 1,200 sub-branches, and because of its huge size only part of it is shown in Fig. 2.
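The correct classification rates in Table 2 can be checked from the counts. The Python sketch below assumes that 838 potato and 2,995 weed test samples were classified correctly by discriminant analysis, and 583 and 3,608 by the decision tree; this is the reading of the table under which the counts reproduce the reported rates. The function name is illustrative.

```python
def correct_classification_rate(correct_per_group, total):
    """CCR in percent: correctly classified samples over all test samples."""
    return 100.0 * sum(correct_per_group) / total


N_POTATO = 153 + 838      # potato test samples in Table 2
N_WEEDS = 2995 + 1014     # weed test samples in Table 2
TOTAL = N_POTATO + N_WEEDS

# Assumed diagonal of the confusion matrix (see the lead-in above).
ccr_da = correct_classification_rate([838, 2995], TOTAL)
ccr_dt = correct_classification_rate([583, 3608], TOTAL)
```

Under this reading the discriminant analysis rate comes out within 0.01 of the reported 76.67%, and the decision tree rate equals the reported 83.82%.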
Discriminant analysis produces a discriminant function for each data set; the discriminant function for this case is given in Eq. (2). The discriminant function in fact gives the coefficients of a plane that separates potato plants from weeds, estimated in the training stage of discriminant analysis.
$$DF(P\&W) = -0.0530 \times R - 0.0015 \times G + 0.0014 \times B + 4.9774 \qquad (2)$$
As mentioned before, Eq. (2) is the equation of a plane: when the R, G and B values of a pixel are substituted into this equation, the sign of the result determines whether the pixel belongs to potato or weed.
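As a worked example, Eq. (2) can be evaluated at the class means of Table 1. The paper does not state which sign corresponds to which class; the Python sketch below assumes that a positive value means potato and a negative value means weed, which is the convention under which the two class means fall on their own sides of the plane. The function names are illustrative.

```python
def discriminant(r, g, b):
    """Eq. (2): linear discriminant score of one RGB pixel."""
    return -0.0530 * r - 0.0015 * g + 0.0014 * b + 4.9774


def classify(r, g, b):
    """Positive score -> potato, otherwise weed (assumed sign convention)."""
    return "potato" if discriminant(r, g, b) > 0 else "weed"


potato_mean = (75.31, 124.25, 78.54)     # mean R, G, B of potato, Table 1
weeds_mean = (105.18, 151.62, 106.44)    # mean R, G, B of all weeds, Table 1

score_potato = discriminant(*potato_mean)   # about +0.91
score_weeds = discriminant(*weeds_mean)     # about -0.68
```

The coefficient of R is far larger in magnitude than those of G and B, so under this function the red component dominates the decision.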
Figure 3 shows the plane of the discriminant function, Eq. (2), together with the R, G and B values of both groups. Part (b) of this graph was produced by rotating the 3D graph and helps to show how the discriminant function classifies the data. The number of pixels for weeds was about 4 times the number of pixels for potato plants, so, as shown in Fig. 3, there are more blue dots than red dots.
Fig. 4: Classification of weeds from potato plant. (a) Primary images; (b) Images after segmenting the soil background; (c) Images classified (weeds from potato) with discriminant analysis; (d) Images classified (weeds from potato) with the decision tree (yellow represents weeds and blue represents potato plant; these colors were chosen arbitrarily)
Table 3: The result of classifying potato and weeds in composition b
                                              Predicted group membership (%)
                               ---------------------------------------------------------------------------
                                        Centaurea  Convolvulus  Cirsium  Chenopodium  Chenopodium
Plant                  Method  Potato   arvense    arvensis     arvense  murale       album
Potato                 DT      60.04     1.92       5.54        11.60     6.86        14.03
                       DA      65.97     0.81       4.04        10.16     6.99        12.03
Centaurea arvense      DT       1.49    76.58       8.09         4.98     6.22         2.61
                       DA       2.04    82.45       8.08         4.36     6.71         0.11
Convolvulus arvensis   DT       6.65     9.53      53.07        11.42     5.52        13.80
                       DA       8.36    13.99      47.08         6.93     5.15        18.5
Cirsium arvense        DT      14.02     5.17      12.30        45.01     8.61        14.88
                       DA      20.8      7.54      10.88        18.99    18.30        23.50
Chenopodium murale     DT       5.58     3.24       3.37         7.79    74.16         5.84
                       DA       5.99    10.48       1.89         8.73    67.81         5.11
Chenopodium album      DT      18.76     2.91      18.28        15.38     4.72        39.95
                       DA      26.94     4.18      20.59         9.64     5.68        32.99
Total CCR              DT      58.12
                       DA      52.48
In composition b, in which the potato plant and each weed are placed in separate classes, the decision tree showed the better accuracy. Its Correct Classification Rate (CCR) was 58%, compared with 52% for discriminant analysis, so the higher accuracy of the decision tree is confirmed. More details of the classification with DA and DT in composition b are given in Table 3.
The decision tree was able to classify all types of weeds and potato simultaneously, and its overall result was better than that of discriminant analysis, as shown in Table 3.
In composition c, as the weeds were different, the classification results were different. With discriminant analysis, the highest rate, 95%, was obtained when classifying potato plant against Centaurea arvense, and the lowest rate, 73%, when classifying potato plant against Chenopodium album. The decision tree gave better classification rates for each weed: its highest rate, 97%, was again for separating potato plant from Centaurea arvense, and its lowest rate, 78%, for potato plant against Chenopodium album. Consequently, the decision tree gave better results for classification between potato plant and weeds in all cases, which shows the high capability of this method.
Figure 4a shows two sample images of weeds and potato plants. After Eq. (1) was applied, the soil background was segmented (Fig. 4b); the result of classifying weeds from potato plants with the discriminant function and with the decision tree is shown in Fig. 4c and d, respectively. As a result, weeds appear in yellow and potato plants in blue, making them distinguishable.
CONCLUSION
A gamma correction function was used to normalize the brightness and contrast of the pictures. Afterward, Eq. (1) was used to separate the soil background. This equation separated the soil background and plants more accurately when it was used together with opening and closing operations. The ability of two methods to classify potato plants and five types of common potato weeds was evaluated in three compositions. In all three compositions, the result of the decision tree was better than that of discriminant analysis. For example, when potato and each type of weed were classified as separate groups (Table 3), the CCR of DT was about 6 percentage points higher than that of DA (DT: 58.12%; DA: 52.48%). However, the type of weed control device to be used is important. For spot spraying with selective herbicides, where the main objective is to minimize herbicide consumption, such classification rates are desirable. Furthermore, when classification was done between one type of weed and the potato plant, the CCR was high. Under real conditions, in many cases just one or two types of weed exist near a potato plant, so this procedure would be able to classify weeds and potato accurately.
REFERENCES
Astrand, B. and A. Baerveldt, 2003. An agricultural
mobile robot with vision-based perception for
mechanical weed control. Auton. Robot., 13: 21-35.
Breiman, L., J.H. Friedman, R.A. Olshen and C.J. Stone,
1984. Classification and Regression Trees.
Wadsworth, Belmont, CA.
Burgos-Artizzu, X.P., A. Ribeiro, A. Tellaeche,
G. Pajares and C. Fernández-Quintanilla, 2009.
Improving weed pressure assessment using digital
images from an experience-based reasoning
approach. Comput. Electr. Agric., 65: 1324-1333.
El-Faki, M.S., N. Zhang and D.E. Peterson, 2000. Factors
affecting color-based weed detection. Trans. ASAE,
43(2): 1001-1009.
Eskandari, F.C., H. Bahrami and A. Asakereh, 2011.
Evaluation of traditional, mechanical and chemical
weed control methods in rice fields. AJCS, 5(8):
1007-1013.
FAO, 2010. Statistic Food and Agriculture Organization
of the United Nations. Retrieved from:
http://www.fao.org/corp/statistics/en/.
Gonzalez, R.C. and R.E. Woods, 2008. Digital Image
Processing. Prentice Hall, New Jersey-Pearson.
Han, J. and M. Kamber, 2001. Data Mining: Concepts
and Techniques. Academic Press, San Diego, CA,
USA.
Jafari, A., S.S. Mohtasebi, H.E. Jahromi and M. Omid,
2006. Weed detection in sugar beet fields using
machine vision. Int. J. Agric. Biol., 8(5): 602-605.
Lee, W.S., D.C. Slaughter and D. Geiles, 1999.
Development of a machine vision system for weed
control using precision chemical application. Trans.
ASAE., 39(13): 220-227.
Liu, J. and M.R. Paulsen, 2000. Corn whiteness
measurement and classification using machine vision.
Trans. ASAE, 43(6): 1669-1675.
Scarr, M.R., C.C. Taylor and I.L. Dryden, 1998.
Unsupervised texture segmentation using reversible
jump Markov chain Monte Carlo methodology.
University of Leeds, Statistics Tech. Report STAT.
Witten, I.H. and E. Frank, 2000. Data Mining: Practical
Machine Learning Tools and Techniques with JAVA
Implementations. Morgan Kaufmann Publishers, San
Francisco, CA, USA.
Woebbeck, D.M., G.E. Meyer, K. Von Bargen and
D.A. Mortensen, 1995. Shape features for identifying
young weeds using image analysis. Trans. ASAE, 38:
271-281.
Yang, C.C., S.O. Prasher, J.A. Landry, H.S. Ramaswamy
and A. DiTommaso, 2000. Application of artificial
neural networks in image recognition and
classification of crop and weeds. Can. Agric. Eng.,
42(3): 147-152.