Uploaded by Annette Dizon

2.2 Review of Related Studies
2.2.1 Local Studies
Smart Farm: Automated Classifying and Grading System of
Tomatoes using Fuzzy Logic
This research study by Lenard Dorado aimed to build a computer
vision system that classifies and grades tomatoes. It uses image
processing and fuzzy logic in MATLAB. The proposed work is
one of the modern approaches to farming (smart farming). Tomatoes were
used as the test subject for the automated classifying and grading
system. The project follows a series of steps. First, the researchers
captured images of the tomatoes and extracted features using image
processing techniques. Their fuzzy logic system then determines whether
a tomato is good or bad. After classification, the good tomatoes are
graded based on their level of ripeness. The automated classifying and
grading system is also intended to reduce the possibility of error. The
study is limited to demonstrating the accuracy and functionality of the
system, which the researchers tested by comparing the results of manual
sorting by a person with the results produced by the system. The work
covers only classification and not the hardware, which the researchers
recommend for further development. The latter part of their paper
focuses on explaining fuzzy logic and MATLAB and how they were used in
the project. (Dorado, Aguila and Caldo,
2016)
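
The accuracy test described above, comparing manual sorting with the system's output, can be sketched as a simple agreement rate. This is only an illustration and not the authors' MATLAB code; the function name and the sample labels are hypothetical.

```python
def agreement_accuracy(manual_labels, system_labels):
    """Fraction of samples where the system's label matches the manual label."""
    if len(manual_labels) != len(system_labels):
        raise ValueError("both label lists must have the same length")
    if not manual_labels:
        raise ValueError("at least one sample is required")
    matches = sum(m == s for m, s in zip(manual_labels, system_labels))
    return matches / len(manual_labels)


# Hypothetical labels for four tomatoes (not data from the study).
manual = ["good", "bad", "good", "good"]
system = ["good", "bad", "bad", "good"]
print(agreement_accuracy(manual, system))
```

The manual labels act as the reference here, so the value reports how often the automated system agrees with the human sorter.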
2.2.2 Foreign Studies
Classification of Green Coffee Bean Images Based on Defect Types
using Convolutional Neural Network (CNN)
This study by Carlito Pinto, Junya Furukawa, Hidekazu Fukai, and
Satoshi Tamura aimed to develop a system that automatically detects
defects in green coffee beans for coffee production in Timor-Leste.
As an initial step, they developed an image processing system
that classifies each bean image by type of defect. For
the classifier, they used a deep convolutional neural
network. The classifier achieved accuracies from 72.4% to 98.7%,
depending on the defect type. The input of the system
is a color image of green coffee beans and the output is the
defect classification. They labelled the green coffee beans into 6
classes: black, sour, fade, peaberry, damaged, and normal. The
inputs of the neural network were the data values from the images in the
dataset. They placed the green coffee beans on white paper before
photographing them. They used a digital
camera in automatic mode with the settings F/16, ISO 200, 1/60 s
exposure time, and auto focus, placed 1 m above the beans. After
photographing the front side of the beans, they also photographed the
back side. Using some image processing techniques, they
isolated the beans. They scaled each image to 256x256
pixels and labelled the images manually. They formed three sets of images
from the prepared pictures: the training set, validation set, and
test set. The training set was used to train the neural network. During
the learning phase, the validation set was used to check
the accuracy achieved during training. The test set was used to
evaluate the sorting performance of the neural network with its final
parameters. The researchers designed their own neural network and
trained it from scratch, using
different functions and steps to build their convolutional neural network.
(Pinto et al., 2017)
Method of Coffee Bean Defect Detection
The purpose of the study “Method of Coffee Bean Defect Detection”
by Betelihem Mesfin Ayitenfsu is to detect defects in coffee beans using
image processing. The author states that image processing
techniques can exceed the limits of human capability in
inspecting and grading the quality of green coffee beans. The study mainly
focused on the size of the green beans and on broken beans. It
uses machine vision and image processing techniques to analyze and
grade coffee beans based on parameters such as a metric value
that depends on the area and perimeter of the bean. An algorithm was
presented to measure the key parameters: the area and metric value of the
coffee bean. To detect defective beans, those key parameters are
compared with the parameters of a model bean. A SONY DSC-H10 digital
camera (8.1 megapixels) was used to capture the images of the coffee beans.
A stand allowed the camera to be moved easily with respect to the
view of the beans. The main goal of the image processing is to measure the
roundness, area, and perimeter using bwboundaries, a boundary tracing
routine in MATLAB. MATLAB served as the platform for the
image processing. As its output, the image processing
classifies each sample into one of 2 classes. Of 100 coffee bean samples,
78.32% were classified as good and 19.68% as damaged, with 2% wrong detections.
(Ayitenfsu, 2014)
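
The roundness check can be illustrated with a short Python sketch. The summary above does not give the formula, so the sketch assumes the common roundness definition 4πA/P², which equals 1.0 for a perfect circle. The function names, the model area, and the thresholds are hypothetical and are not taken from the cited study.

```python
import math


def roundness(area, perimeter):
    """Roundness metric 4*pi*A / P**2 (assumed definition; 1.0 for a circle)."""
    if perimeter <= 0:
        raise ValueError("perimeter must be positive")
    return 4.0 * math.pi * area / perimeter ** 2


def is_defective(area, perimeter, model_area, area_tol=0.2, min_roundness=0.8):
    """Flag a bean whose area or roundness deviates from the model bean.

    area_tol and min_roundness are illustrative thresholds, not study values.
    """
    area_ok = abs(area - model_area) <= area_tol * model_area
    round_ok = roundness(area, perimeter) >= min_roundness
    return not (area_ok and round_ok)


# A circular outline of radius 2 has area 4*pi and perimeter 4*pi.
print(roundness(4 * math.pi, 4 * math.pi))
```

A broken bean usually has a longer, more irregular boundary for its area, which pushes the metric below the value of an intact, rounded bean.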
Transfer learning using Convolutional Neural Networks for Object
Classification within X-ray Baggage Security Imagery
The work of Akcay et al., “Transfer learning using Convolutional
Neural Networks for Object Classification within X-ray Baggage Security
Imagery”, is another deep Convolutional Neural Network (CNN) project,
but this time it was done through transfer learning, applied to
object classification within X-ray baggage security imagery. They used
transfer learning instead of the traditional approach, which requires a
large amount of training data. CNN with transfer learning achieves superior
performance compared to prior work. They make use of the CNN configuration
that won the ILSVRC-2012 competition (AlexNet), which has 5 convolutional
layers and 3 fully-connected layers, with 60 million parameters and 650,000
neurons. It was trained on the ImageNet dataset. They also employ the
ILSVRC-2014 winner (GoogleNet), which has many more layers (22) and 12
times fewer network parameters than AlexNet. They fine-tuned
the networks using the backpropagation algorithm with
stochastic gradient descent. They froze the parameters of certain
layers so that these are reused for the new dataset instead of being
updated during training. They defined two classification problems:
a.) 2 classes (guns vs no guns) and
b.) 6 classes (firearm, firearm-components, knives, ceramic knives,
camera, and laptop). They trained on their dataset with varying sets of
frozen layers, e.g. freeze layer 1; freeze layers 1 and 2; freeze layers
1, 2 and 3; etc., and
evaluated each result to find which setup has the highest accuracy. (Akcay
et al., 2016)
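
The layer-freezing experiment can be sketched in plain Python, without a deep learning framework, by listing the cumulative freeze setups. The layer names are hypothetical placeholders for an AlexNet-like network with 5 convolutional and 3 fully-connected layers. The sketch also assumes that the last layer is never frozen, since it has to be retrained for the new classes.

```python
# Hypothetical layer names for an AlexNet-like network (5 conv + 3 FC layers).
LAYERS = ["conv1", "conv2", "conv3", "conv4", "conv5", "fc6", "fc7", "fc8"]


def freeze_configurations(layers):
    """Yield (frozen, trainable) splits: freeze layer 1, layers 1-2, 1-3, ..."""
    # Assumption: the final layer always stays trainable.
    for k in range(1, len(layers)):
        yield layers[:k], layers[k:]


def trainable_flags(layers, frozen):
    """Map each layer name to True if it is updated during fine-tuning."""
    frozen_set = set(frozen)
    return {name: name not in frozen_set for name in layers}


for frozen, trainable in freeze_configurations(LAYERS):
    print(f"freeze {frozen} -> fine-tune {trainable}")
```

In a real framework, each setup would be trained and evaluated separately, and the one with the highest accuracy would be kept.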
Fine Tuning CNNs with Scarce Training Data – Adapting ImageNet to
Art Epoch Classification
The objective of this study is to use transfer learning to overcome the
problem of limited training data. The researchers applied transfer learning
to create a system that classifies paintings by type. They had
a limited amount of data because their main focus is on
paintings. They used images available on the web; specifically,
they used the Wikipaintings collection as their source of data. The
researchers used the winner of ILSVRC-2012 (AlexNet), which was already
trained on the ImageNet dataset. The pre-trained AlexNet model kept its
ImageNet training and was then fine-tuned on their dataset from the
Wikipaintings collection. The trained CNN was then evaluated and compared
with linear models based on Improved Fisher Encodings. The classifier can
classify paintings by art epoch, such as Baroque, Renaissance, or Impressionism.
(Hentschel, Wiradarma and Sack, 2016)
The Effectiveness of Data Augmentation in Image Classification
using Deep Learning
This study by Jason Wang of Stanford University and Luis Perez
of Google evaluated approaches to image classification using data
augmentation. Cropping, rotating, and flipping images are the traditional
data augmentation techniques that were developed and
tested in earlier works. The researchers formed small subsets
of ImageNet on which to apply and evaluate data augmentation techniques.
They reported that the traditional techniques mentioned above were among
the successful ones. They also experimented with GANs
(Generative Adversarial Networks) to produce images in different styles. A
method was proposed to let a neural network learn augmentations that
yield a better classifier; they call this neural augmentation. The
researchers limited their data to two classes and built neural network
classifiers to recognize the correct class in order to evaluate the
effectiveness of the augmentation techniques. The researchers trained a
small neural network to perform the classification. CycleGAN
was used to augment the images by transferring them
to a fixed predetermined style, such as night and day or winter and
summer. As a final step, they explored and proposed a different kind of
augmentation process in which two neural networks are connected:
one transfers style and the other classifies. In this way, their neural
network learns
augmentations that reduce classification loss. (Wang and Perez, 2017)
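
The traditional augmentations named above (cropping, rotating, and flipping) can be sketched in plain Python on an image stored as a list of pixel rows. This is only an illustration; the function names are hypothetical, and real pipelines would normally use an image library.

```python
def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def rotate_90_clockwise(img):
    """Rotate the image by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def crop(img, top, left, height, width):
    """Cut out a height x width window whose corner is at (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]


def augment(img):
    """Return the original image plus a flipped and a rotated variant."""
    return [img, flip_horizontal(img), rotate_90_clockwise(img)]


tiny = [[1, 2], [3, 4]]  # a 2 x 2 "image" with one value per pixel
for variant in augment(tiny):
    print(variant)
```

Each variant keeps the label of the original image, which is how these transformations enlarge a training set without new photographs.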
A New Image Classification Method Using CNN Transfer Learning and
Web Data Augmentation
This work was done by Dongmei Hana, Qigang Liu, and Weiguo Fan.
They proposed a two-phase method combining CNN transfer learning and
web data augmentation to address the problem of limited training data. With
their method, the feature representation of a pre-trained neural network
can be efficiently transferred to a new target task. They stated that their
method not only reduces the need for a large dataset but also
enlarges the existing training data. Together, these two techniques help
address over-fitting of deep CNNs on a small dataset. The method they
proposed is composed of two phases: phase one builds a powerful
classifier using the current training data; phase two focuses on augmenting
the dataset with the help of the classifier developed in the first phase.
Their solution was applied to six small public datasets and achieved
better performance than the traditional approach. They stated
that the results of their experiment show that their proposed solution
is a strong option when deep CNNs must be trained on a
small dataset. The results of their study showed that ResNet achieved the
highest accuracy among all the state-of-the-art models on the six small
datasets. (Hana, Liu and Fan, 2017)
Convolutional Neural Network Transfer Learning for Robust Face
Recognition in NAO Humanoid Robot
This study evaluates two well-known CNN architectures,
AlexNet and VGG-Face, for the face recognition task. The researchers apply
transfer learning to the pre-trained networks to perform recognition. Their
face recognition framework requires only one example image per person to
achieve accurate face recognition. The proposed
framework was then implemented on the humanoid robot known as NAO to
test the practicality and flexibility of their algorithm in a practical
environment. The NAO’s low-resolution camera and a separate
high-resolution camera were used to obtain the experimental results. This
resulted in excellent recognition of a new person from a single example
image under varying distance and resolution. They retrained AlexNet
on the CASIA-WebFace database to perform transfer learning
on that architecture. The database consists of half a million
celebrity face images covering 10,575 unique identities. They resized the
images to fit the input layer of the CNN. AlexNet was trained using
stochastic gradient descent (SGD) with an initial learning rate of 0.001.
VGG-Face was left as is because it was already trained for face
recognition. The results of their study showed that VGG-Face is much
more accurate than AlexNet. Nevertheless, the study showed that transfer
learning can be used to accomplish a real-time face recognition task. They
also concluded that the resolution of the image does not have a great
impact on performance. (Bussey et al., 2017)
A Machine Vision based Pistachio Sorting Using Transferred
Mid-Level Image Representation of Convolutional Neural Network
This study aimed to build a computer vision system that separates
open-shell pistachios from defective ones as well as from trash. The
images of pistachios and of trash such as branches or twigs were fed to
the new model, which uses a support vector classifier. They used a Canon
600D camera to capture the images of pistachios. The images were taken
under four different lighting conditions against a dark background to
simulate a conveyor belt. The pistachios were scattered. Each image
contains multiple objects at a resolution of 5184 x 3456 and was
cropped into 400 x 400 RGB images. They produced 1000 unique images.
After performing image augmentation, they had 20000 images.
Scaling, rotation, and changes in lighting conditions were the augmentation
techniques used. They used image segmentation to detect each object
individually, performing Canny edge detection followed by active
contour fitting. In their study, they used two ILSVRC winners, AlexNet
and GoogleNet, as feature extractors through transfer learning. They
used MATLAB 2017 as their working platform. A linear support
vector machine was used to classify the data into the desired output of
the system. As a result, they obtained 99% accuracy with the transferred
weights of GoogleNet and 98% with AlexNet. (Farazi, Zadeh and
Moradi, 2017)
A Robust Deep-Learning-Based Detector for Real Time Tomato Plant
Diseases and Pest Recognition
This research study aimed to find the most suitable deep learning
architecture, combined with deep feature extractors, for accurate
and fast detection of diseases and pests in tomato plants. The
researchers focused mainly on the identification and recognition of diseases
and pests affecting tomato plants. Meta-architectures based
on deep detectors were used to identify the region of interest in the image.
The detectors used in this study are Faster Region-based Convolutional
Neural Network, Region-based Fully Convolutional Network, and Single
Shot Detector, combined with deep feature extractors including VGG net
and Residual Network. All gathered images were captured under different
conditions and scenarios using camera devices with various resolutions.
The dataset used by the researchers consists of about 5,000 images
gathered from different farms located in the Korean Peninsula. The dataset
was manually annotated by marking a bounding box around each area of the
tomato images showing diseases or pests and assigning its class. Data
augmentation techniques such as flipping, rotation, and cropping of images
were also used to increase the size of the dataset. The dataset was divided
into an 80% training set, a 10% testing set, and a 10% validation set. The
system was trained and tested on an Intel Core i7 3.5 GHz processor with two
NVidia GeForce Titan X GPUs. The overall performance of the system reached
a mean AP of more than 80% in the best cases. (Alvaro Fuentes, 2017)
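
The 80/10/10 division described above can be sketched in plain Python. The function name, the fixed random seed, and the use of integer IDs in place of image files are assumptions made only for illustration.

```python
import random


def split_dataset(items, train=0.8, test=0.1, seed=0):
    """Shuffle items and split them into training, testing, and validation sets."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split repeatable
    n_train = round(len(items) * train)
    n_test = round(len(items) * test)
    training = items[:n_train]
    testing = items[n_train:n_train + n_test]
    validation = items[n_train + n_test:]  # whatever remains (about 10%)
    return training, testing, validation


# 5,000 stand-in image IDs, matching the approximate dataset size above.
training, testing, validation = split_dataset(range(5000))
print(len(training), len(testing), len(validation))
```

Shuffling before the split keeps images from the same farm or session from being grouped into a single subset.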
2.3 Synthesis
In a local study entitled “Smart Farm: Automated Classifying and Grading
System of Tomatoes using Fuzzy Logic” (Dorado, Aguila and Caldo, 2016), the authors
focused only on classifying and grading the object and not on developing a
machine that sorts. Their goal, to classify, is similar to the goal of this paper. Both
papers focused not on developing a sorting machine but on developing a system that
classifies. The previous study has a two-stage system: the first stage classifies tomatoes as
good or bad, and the second grades each good tomato based on ripeness. The study of Dorado used
MATLAB and fuzzy logic, while this paper used Python as the programming language and
deep learning.
The study (Pinto et al., 2017), published by IEEE and entitled “Classification
of Green coffee bean images based on defect types using convolutional neural
network (CNN)”, is similar to this paper in purpose: classifying
green coffee beans. Even though both studies focused on green coffee beans, there were
differences in the dataset used. The study (Pinto et al., 2017) classified the
green coffee beans by defect type into 6 output classes, including the normal class;
this paper focused on only 2 classes: the defective and the normal bean. Our study
mainly focused on Barako Coffee green beans, which are exclusive to tropical countries.
The previous study trained its deep neural network from scratch and successfully
increased the accuracy of the system, while this paper uses transfer learning.
Some techniques from the previous study served as our reference, such as the
techniques for capturing images of the green coffee beans to produce a dataset.
The study “Method of Coffee Bean Defect Detection” (Ayitenfsu, 2014) is similar
to this research paper in having the same number of output classes. Neither paper
focused on a specific type of defect, but the study of Ayitenfsu focused only on the
roundness and area of the bean. Even though this research paper did not focus on the
type of defect, it can detect all types of defect as one class even if a defective bean
has a standard area. The study of Ayitenfsu used MATLAB as the platform to perform
image processing. For this paper, some of its techniques for capturing images were
studied and used.
Although not related to the goal of this paper, which is classifying green coffee beans, the
study “Transfer learning using Convolutional Neural Networks for Object
Classification within X-ray Baggage Security Imagery” (Akcay et al., 2016) was used
as a reference for transfer learning and convolutional neural networks. The two works have
different applications but use the same tools. The prior research focused on two models,
AlexNet and GoogleNet, while in this paper, the top-performing models
were evaluated to find the architecture best suited to the given task.
Some techniques from the previous study were used as reference to achieve better results
in this paper.
The work of (Hentschel, Wiradarma and Sack, 2016), “Fine Tuning CNNs with
Scarce Training Data – Adapting ImageNet to Art Epoch Classification”, helped this
study perform transfer learning using fine-tuning. Given the limited source of data,
the work of Hentschel et al. contributed greatly to improving this research study.
In the research study of Perez and Wang, “The Effectiveness of Data
Augmentation in Image Classification using Deep Learning” (Wang and Perez, 2017),
data augmentation was the main concern. To prevent overfitting, data
augmentation was also done in this paper. The researchers used some of the data augmentation
techniques cited in the previous study to increase the amount of training
data needed in this study.
The study of Hana, Liu, and Fan entitled “A New Image Classification Method
Using CNN Transfer Learning and Web Data Augmentation” (Hana, Liu and Fan,
2017) evaluated the state-of-the-art deep CNN models to find out which models can be
used for the given task. As in this paper, state-of-the-art deep CNN models were trained
and evaluated. Both studies used transfer learning
and augmentation; the difference is that the previous study trained on many datasets, while
this research uses only one dataset, the green coffee beans. The data augmentation
was also different: the study of Hana, Liu, and Fan used web data augmentation,
while ours used traditional augmentation.
The study “Convolutional Neural Network Transfer Learning for Robust Face
Recognition in NAO Humanoid Robot” (Bussey et al., 2017) is related to this research
study because both evaluated the state-of-the-art AlexNet model on a specific task.
Both studies also used transfer learning to fit the given task. This study is
not limited to one model, and the researchers performed transfer learning on all of them, unlike
the previous work, which performed this task on only one model and compared it with an
existing one that was already trained for the task. The transfer learning techniques
in the previous study, such as fine-tuning, served as the reference for this
paper in performing the same method.
The pistachio sorting study “A Machine Vision based Pistachio Sorting Using
Transferred Mid-Level Image Representation of Convolutional Neural Network” by
(Farazi, Zadeh and Moradi, 2017) performed transfer learning for feature extraction and
used an SVM for the classification task, while this study aimed to build a classifier using deep
learning, specifically transfer learning. This study used the pre-trained CNN for both
tasks. The study (Farazi, Zadeh and Moradi, 2017) is similar to (Ayitenfsu, 2014) and
(Dorado, Aguila and Caldo, 2016) in using MATLAB as the platform, but this paper
is different because the researchers used the Python language. Python has wide support
and many contributing developers, unlike MATLAB, which is supported only by its own vendor. Image
augmentation techniques like rotating and scaling were used in this paper.
The study entitled “A Robust Deep-Learning-Based Detector for Real-time
Tomato Plant Diseases and Pest Recognition” is related to this research study, which
also used deep meta-architectures such as Faster RCNN and Single Shot Detector
combined with feature extractors including Inception V2, Resnet-50, Mobilenet V1 and
Mobilenet V2. Both studies implemented a robust deep-learning-based detector
using images captured in complex scenarios.