

CS3244 Group Project Report - Group 33
Written by: Leong Jia Wei, Marcus (A0183502M), Chua Hua Ren (A0189749B), Ong Ming Chung (A0189174R), Lin Mei An (A0189040H), Poh Lin Wei (A0190339B), Wang Runding (A0182884N)
1. Introduction
Children are often curious about their surroundings (Honig
et al. 2003). In their attempts to better understand their environment (Stephens 2007), they often ask questions (Singh 2014). However, when the questions concern the biodiversity around them, parents in Singapore may not be
able to appropriately respond to these questions as they are
often disconnected from the natural environment (Chan et
al. 2015). When children are unable to elicit an answer after
posing a question, they may be discouraged from questioning in the future. This can have a detrimental effect on their
learning experience as curiosity is essential for learning (Engel 2013).
To mitigate this, we propose a mobile application to encourage the exploration of nature amongst children. They
can take pictures of animals and plants that they are curious
about, and the application will display information about the
species identified. Through gamifying the process, we can
further incentivise them to discover more species. However, due to the complexity of such a model and the limited time available, we narrowed the scope of our application to a proof of concept. We chose to focus on butterflies since
almost half of Singapore’s native butterfly species have become extinct over the past 160 years (Tan 2020). Hence, it is
important for children to gain more appreciation of the local
butterfly species around them before they vanish.
The rest of the paper is structured as follows: In section
2, we discuss a few relevant works; in section 3, we explain
how we obtained our dataset for this project; in section 4, we
elaborate on our proposed methods; in section 5, we present
our experiment and results; in section 6, we discuss
our application and how it is customized for our target audience; and in section 7, we conclude our report.
2. Related Works
Zhu et al. (2019) developed a similar application. In their
work, they experimented with two methods: training a four-layer convolutional neural network (CNN) from scratch, and
tuning a pre-trained VGG19 model (Simonyan et al. 2015).
While the latter approach provided better results, they felt
that the improvement was not sufficiently significant to justify the use of the more complex pre-trained CNN. Their
final model can identify butterflies of 10 different species
with an accuracy of 98.53%. Unfortunately, as they trained
their model using the Leeds Butterfly Dataset (Wang et al.
2009), which includes many butterfly species that are not
found in Singapore (National Parks 2019), the application
may be less relevant to children in Singapore. Furthermore,
since the application does not specifically cater to the needs
of children, it may not appeal to them.
Peng et al. (2017) also worked on a related problem: fine-grained classification, which involves the identification of objects that belong to the same category but to different subcategories. For instance, they trained their model to classify
different bird species. In their work, they proposed the use
of an object-part attention model (OPAM), which is based on
CNN. The model first identifies and crops the object of interest. Then, it identifies the crucial parts of the object, and
classifies the object. The novelty of their method lies in the
fact that their model is able to perform well despite being
trained on data with minimal annotations. However, since
their model incorporates four different CNNs that are based
on VGGNet, a rather dense network, the time taken for prediction may be longer than desired. As such, direct application of their approach might not be ideal.
3. Data Collection Methods
To obtain the butterfly images, we scraped images of 5 different butterfly species that are commonly found in Singapore (National Parks 2019) from the image-hosting service
Flickr. To do so, we used an API provided by Flickr and
Python’s flickrapi package.
Thereafter, we manually cleaned the scraped images. A few of the scraped images were problematic because they either depicted irrelevant butterfly species or were irrelevant images altogether.
To avoid complications during the training of our model,
we removed these noisy images. In addition, as Krause et
al. (2018) found that a balanced dataset is important for the
trained models to work well, we settled on 300 images per
species, with a total of 1500 images in our final dataset.
4. Proposed Methods
In this section, we elaborate on our proposed method.
4.1. Overview
We tuned pre-trained CNNs for the purpose of butterfly
species identification. Furthermore, inspired by Peng et al.’s
work (2017), we attempted to improve the prediction accuracy by detecting the butterflies in the image and cropping
out their background before passing them as inputs into the
CNN. To detect the butterflies, we used a neural network
to determine their locations. In addition, since the butterfly
might move while the child takes a picture of it, the image
taken might be blurred. To reduce the possible resulting inaccurate predictions, we deblurred images provided by the
children before the images were cropped and predictions
were made.
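The overall flow can be summarised as a simple pipeline. In the sketch below, deblur_image, detect_butterfly, crop_to_box and classify_species are hypothetical placeholders for the components described in sections 4.2 to 4.4, not actual library calls.

```python
# High-level sketch of the proposed prediction pipeline; all helper
# functions are hypothetical placeholders for the components below.
def predict_species(user_image):
    # 1. Deblur the user-provided photo (DeblurGAN-v2, section 4.4).
    sharp = deblur_image(user_image)
    # 2. Locate the butterfly and remove the background (YOLOv3, section 4.3).
    box = detect_butterfly(sharp)
    cropped = crop_to_box(sharp, box)
    # 3. Classify the cropped butterfly with the fine-tuned CNN (section 4.2).
    return classify_species(cropped)
```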
4.2. Identification of Butterfly Species
We chose to explore the use of CNN for butterfly species
identification because it has shown promising results for image identification tasks (Simonyan et al. 2015) and even fine-grained classification tasks (Peng et al. 2017). Moreover, it has been found that, compared to handcrafted features, CNNs
are often better able to capture mid-level representations that
are useful for classification tasks (Oquab et al. 2014).
Unfortunately, due to the large number of parameters
present in CNNs, training a CNN often requires a large
dataset, which we lack. Therefore, we decided to perform
transfer learning, which involves the use of information from
one domain to improve the performance of a learner that is
in a different but related domain (Weiss et al. 2016). Specifically, motivated by Zhu et al.’s (2019) findings, we first experimented with using VGG19 (Simonyan et al. 2015) as our
base model. Thereafter, we tried using MobileNetV2 (Sandler et al. 2019) because the storage space required by a
VGG19 model may render it infeasible for predictions to be
done locally, on mobile phones.
4.3. Object Detection
Since the butterflies in the images may be very small, we hypothesise that cropping out the images' background and focusing on the butterflies can reduce noise. Moreover, since we hope to differentiate between butterfly species, small differences in the image can significantly affect the resulting output (Peng et al. 2017), which further justifies cropping the images.
We chose YOLOv3, the latest variant of the popular object detection algorithm You Only Look Once (YOLO), to detect the butterflies. It is able to detect multiple butterflies in an image. The output of the detection model is four coordinates that demarcate the butterfly's location. After obtaining these coordinates, we crop out the rest of the image.
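A minimal sketch of this cropping step is shown below, assuming the detector returns pixel coordinates (x1, y1, x2, y2); the use of Pillow here is illustrative.

```python
# Crop an image to a detected bounding box, assuming the detector
# returns (x1, y1, x2, y2) pixel coordinates.
from PIL import Image

def crop_to_box(image_path, box, out_path):
    x1, y1, x2, y2 = box              # four coordinates demarcating the butterfly
    img = Image.open(image_path)
    img.crop((x1, y1, x2, y2)).save(out_path)
```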
The mean average precision (mAP) at a given Intersection over Union (IoU) threshold is a widely used metric for evaluating object detection models. IoU measures how precise a prediction is by comparing the overlap between the true object area and the predicted object area; whenever the IoU value is above the threshold, the prediction is considered correct. COCO's AP metric, on the other hand, averages over stepped IoU thresholds from 0.5 to 0.95 with a step size of 0.05, so models that localise objects very precisely score well on it. Under the COCO AP metric, YOLOv3 lags behind models such as RetinaNet, but it is faster. Furthermore, under the mAP at IoU = 0.5 metric, YOLOv3 still performs very well: it is almost comparable with RetinaNet. This illustrates that YOLOv3 detects objects sufficiently well (Redmon 2018). We chose YOLOv3 for its speed and reasonable accuracy.
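To make the metric concrete, the following small illustration computes IoU for two axis-aligned boxes given as (x1, y1, x2, y2); it is an explanatory sketch rather than the evaluation code we used.

```python
# Illustration of the IoU computation underlying the mAP metric.
def iou(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection rectangle (zero area if the boxes do not overlap).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((ax2 - ax1) * (ay2 - ay1)
             + (bx2 - bx1) * (by2 - by1)
             - inter)
    return inter / union if union > 0 else 0.0

# A prediction counts as correct at mAP@0.5 whenever iou(truth, pred) > 0.5.
```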
4.4. Deblurring Images
Image blurring is a central issue in computer vision as blurriness introduces noise that could lead to misclassifications by the trained model (Dodge et al. 2016). Accounting for blurriness and noise is thus vital, especially when our users take pictures of butterflies, where there could be motion blur due to camera shake or the butterflies' movements. A potential solution is to include blurred images in our dataset. However, this could lead to lower performance on high-quality images (Kupyn et al. 2019). Hence, we decided to deblur user-submitted images: after the user uploads an image, we run it through a pre-trained model that deblurs it. We chose DeblurGAN-v2 for this mainly due to its efficiency. Specifically, while its performance is slightly behind Scale Recurrent Networks (SRN), it takes 78% less inference time, at 0.35s (Kupyn et al. 2019). We
believe that efficiency is especially important since our target audience is children and they tend to be more impatient
(Stutter et al. 2013).
Existing CNNs for image deblurring work by using multi-stream CNNs which process the input image at multiple scales, resulting in semantically strong features at all resolution levels but leading to a considerable increase in inference time. In comparison, DeblurGAN-v2 uses the lighter-weight Feature Pyramid Network (FPN) (Lin et al. 2016), which comprises a bottom-up pathway that passes the image through a convolutional network to extract features into a feature map, and a top-down pathway that improves the resolution of the feature map and localises objects. While this may result in lower accuracy, the inference time for FPNs is considerably lower (Kupyn et al. 2019).
5. Experiment and Results
In this section, we analyse how the CNN (with transfer
learning) performs with different base models, and compare them with our baseline model, a support vector machine (SVM). Then, we evaluate the effectiveness of using
object detection and deblurring techniques in improving the
model’s prediction accuracy.
5.1 Baseline Model: Support Vector Machine
The baseline model that we chose to compare with our proposed method is the SVM, which is known for being easy to train. Moreover, unlike neural networks, it does not suffer from local optima, and it copes relatively well with high-dimensional data.
Data For each butterfly species, we wrote a script to shuffle the images and randomly select 200 images for training,
50 images for validation and 50 images for testing. Following Ripley’s recommendation (1996), we used the images
in the validation set to tune our model’s hyperparameters,
and the images in the testing set for a final evaluation of our model.
As pointed out by Wang et al. (2017), simple data augmentation techniques can improve the accuracy of a model’s
prediction. Thus, we randomly rotated, flipped and added
noise to the images used for training. This was done using scikit-image.
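A minimal sketch of these augmentations with scikit-image and NumPy is shown below; the rotation range and noise settings are illustrative rather than the exact values we used.

```python
# Simple augmentations with scikit-image: random rotation, horizontal
# flip and Gaussian noise. Parameter values are illustrative.
import numpy as np
from skimage import io, transform, util

def augment(image_path, rng=np.random.default_rng()):
    img = io.imread(image_path)
    rotated = transform.rotate(img, angle=rng.uniform(-30, 30), mode="edge")
    flipped = np.fliplr(img)
    noisy = util.random_noise(img, mode="gaussian")
    return rotated, flipped, noisy
```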
Implementation We used a linear kernel with sklearn.svm.SVC as the model's kernel function. Next, we resized each image to 128 by 128 pixels, and extracted the HOG (histogram of oriented gradients) and colour features of these images. Since the dimensionality of these features was too large, we applied principal component analysis (PCA) to reduce it. The resulting vectors were then used as the SVM's input.
Results The results are surprisingly satisfactory for the classification of butterflies from 5 species: the average true positive rate is 84.2%. In addition, since our application displays the top 3 possible options, we also measured top-3 accuracy, which stands at 99.4%.
5.2 Proposed Model: Convolutional Neural Network
Data We separated and processed the data as described
in section 5.1. However, image augmentation was done using Keras’ ImageDataGenerator class. In addition to the
augmentation techniques described previously, we also randomly zoomed, sheared and shifted the images. Furthermore, given the importance of preprocessing in image classification tasks (Kuntal et al. 2016), we also preprocessed the
images during training, validation and testing. The preprocessing done is the same as that used by the base models’
authors to train their models. For this, we used the implementation provided in the Keras Applications library.
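The following sketch shows how such a generator might be configured, assuming tf.keras and a train/<species>/*.jpg directory layout; the augmentation ranges and target size are illustrative. The preprocessing function would match the chosen base model (vgg19 or mobilenet_v2).

```python
# Augmentation + base-model preprocessing with Keras (illustrative values).
from tensorflow.keras.applications.vgg19 import preprocess_input
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_gen = ImageDataGenerator(
    preprocessing_function=preprocess_input,  # same preprocessing as the base model's authors
    rotation_range=30,
    horizontal_flip=True,
    zoom_range=0.2,
    shear_range=0.2,
    width_shift_range=0.1,
    height_shift_range=0.1,
)
train_data = train_gen.flow_from_directory(
    "data/train", target_size=(224, 224), batch_size=32, class_mode="categorical"
)
```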
Network Architecture When applying transfer learning,
the choice of which layers (of the base model) to fine-tune
can significantly affect the model’s eventual performance
(Ghazi et al. 2017): fine-tuning the later layers of a network can be helpful as the weights of these layers tend to
be more problem-specific; however, fine-tuning too many
layers when data is scarce can result in overfitting. Besides
selecting the layers to fine-tune, we needed to decide which
base model’s layers to remove, and the number of adaptation
layers to add.
As we did not have prior experience with such tasks,
we decided to follow Oquab et al.’s approach (2014),
which gave promising results for their transfer learning task.
Specifically, we froze the layers of the base model, removed
the output layer, and added one hidden and one output layer
with ReLU and softmax activation functions respectively.
The resulting model architectures examined are shown in
figure 1.
The weights and architecture of the pre-trained models
used were obtained from Keras Applications.
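Below is a minimal sketch, assuming tf.keras, of a V1-style model: a frozen VGG19 base with its original 1000-way output layer removed, followed by one hidden ReLU adaptation layer and a 5-way softmax output. The width of the hidden layer is an assumption, since the report does not state it.

```python
# V1-style architecture sketch: frozen VGG19 base + one hidden ReLU layer
# + 5-way softmax output. Hidden-layer width is illustrative.
from tensorflow.keras import Model, layers
from tensorflow.keras.applications import VGG19

base = VGG19(weights="imagenet", include_top=True)
base.trainable = False                                     # freeze the pre-trained layers

features = base.layers[-2].output                          # drop the original output layer
hidden = layers.Dense(256, activation="relu")(features)    # adaptation layer
outputs = layers.Dense(5, activation="softmax")(hidden)    # 5 butterfly species
model = Model(inputs=base.input, outputs=outputs)
```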
After attaining satisfactory results (refer to table 1), we
tried to remove the two fully-connected layers that are
present in VGG19 to get V2 (refer to figure 1). We hypothesized that the model would be able to perform as well, if not
better, because our classification task only involves 5 categories (as opposed to that of the source task, in which there
are 1000 categories) and hence we might not need such a
complex model (Bebis et al. 1994). Since there are no fully-connected layers in MobileNet and we found that fine-tuning
the convolutional layers gives undesirable results, we tried to
prune the network by using one adaptation layer instead of
two, which gives us M2 (refer to figure 1).
Batch size Similar to Zhu et al. (2019), we found that using a smaller batch size of 32 during training gave better results. For instance, when training the model, which is based
on MobileNet, with a batch size of 32, we get a top-1 accuracy of 77.8% after 120 iterations (which is before the convergence of its accuracy). On the other hand, when training
the same model with a batch size of 64, we get a top-1 accuracy of 75.4% after the same number of iterations. A plausible reason for this is that stochastic behavior is observed
when a smaller batch size is used. Hence, the risk of descending into a local minimum (instead of the global minimum) is reduced (Zhu et al. 2019), which improves the model's performance. Since using a smaller batch size gives better results for the same number of iterations and requires less training time, we decided to use a batch size of 32 for the rest of our experiments.
Optimizer and Loss Function Initially, we tried using the
Adaptive Moment Estimation (Adam) optimizer in the earlier training stages and switching to stochastic gradient descent (SGD) with low learning rates in the later stages because Keskar et al. (2017) found that doing so can give better results. However, as our models performed relatively well
(refer to table 1) even without the use of SGD, the improvements observed, after this modification was made, were minimal. Hence, we chose to only use Adam as our optimizer
throughout the rest of our experiments. We tuned the optimizer’s parameters by observing their effects on our validation set. Eventually, we found that using those presented in
Kingma et al.’s work (2015) worked best.
Since we are interested in classifying butterflies into one
of the five species, we used categorical cross entropy as our
objective function.
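The training configuration can be sketched as follows, continuing from the model and generators above; the Adam parameters are the defaults from Kingma et al. (2015), while the number of epochs and the validation generator (val_data) are assumptions for illustration.

```python
# Training configuration sketch: Adam with default parameters,
# categorical cross-entropy loss, top-3 accuracy tracked for reporting.
from tensorflow.keras.metrics import TopKCategoricalAccuracy
from tensorflow.keras.optimizers import Adam

model.compile(
    optimizer=Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999),
    loss="categorical_crossentropy",
    metrics=["accuracy", TopKCategoricalAccuracy(k=3)],
)
# The batch size of 32 is set on the data generators (see the Data paragraph).
model.fit(train_data, validation_data=val_data, epochs=30)
```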
Results The accuracy of the models is summarised in Table 1.

Table 1: Top-1 and top-3 accuracy (%) of the CNN models
The results obtained are highly satisfactory. The overall
efficacy of transfer learning in this case could be attributed
to the presence of overlap between the source and target
tasks. Specifically, the pre-trained CNNs were trained on
ImageNet, which includes the categorisation of butterfly images. Although the butterfly images that are of interest in the
source task differ from ours, the close similarity in tasks can
improve our models’ performance (Oquab et al. 2014).
It is interesting to note that, contrary to our hypothesis,
M2 did not perform as well as M1. This could imply that the increase in expressive power from adding a fully-connected layer is helpful in this case. However, by doing
so, the model’s size also increases. Since we hope to keep
the model lightweight, we did not experiment with adding
more layers to it.
Figure 1: V1, the CNN with VGG19 as its base model; and M1, the CNN with MobileNet as its base model; V2 and M2 are the pruned versions of V1 and M1 respectively
5.3 Comparison of Models
As seen from table 2, while all models give reasonably satisfactory results, V2 outperforms the rest.
Table 2: Comparison of selected models (top-1 %, top-3 %, size in MB)
Although SVM performed satisfactorily on 5 classes, its
prediction accuracy dropped drastically when the number
of classes increased. Specifically, using the data collection
method described in section 3, we obtained images of butterflies from 4 other species. Then, we trained the model to
identify the butterfly species from these 9 species. The average true positive rate and top-3 accuracy fell to 43.4%
and 71% respectively. This shows that the model is not very
scalable. In contrast, the top-1 and top-3 accuracy of V2 on this larger dataset remained at 95.8% and 97.6% respectively. Since we hope to scale up the application in the
future and our CNN model has shown the potential to do so,
we decided to go with V2.
While V2 performed well in terms of accuracy, its size is huge compared to M1's. Considering that this is a mobile application and the device may have limited memory space and computing power, we hope to exploit M1's advantage in our application as well (to be discussed in section 6.3). As for the SVM model, although its size is quite small, it can grow more rapidly than M1's when the number of classes increases. This is because the SVM adopts a one-vs-rest approach, which trains N classifiers for N classes. Thus, it is not scalable and is unsuitable for fulfilling our objectives.
5.4 Object Detection + CNN
Object Detection During the training of the object detection model, we fine-tuned a pre-trained YOLO model to leverage transfer learning. We used the imageai package, which provides a download of the pre-trained model on its GitHub repository. In order to train the model, we needed annotated images. We randomly picked about 500 images from the 5 butterfly species in our dataset and annotated them in Pascal VOC format using the LabelImg tool. The trained model's mAP was 0.9090 at an IoU threshold of 0.5.
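A sketch of how this fine-tuning might be set up with imageai's custom detection trainer is shown below (assuming imageai 2.x). The dataset directory name, batch size and number of experiments are illustrative; imageai expects train/ and validation/ subfolders containing images/ and Pascal VOC annotations/.

```python
# Fine-tuning YOLOv3 on annotated butterfly images with imageai
# (illustrative directory names and hyperparameters).
from imageai.Detection.Custom import DetectionModelTrainer

trainer = DetectionModelTrainer()
trainer.setModelTypeAsYOLOv3()
trainer.setDataDirectory(data_directory="butterfly_dataset")
trainer.setTrainConfig(
    object_names_array=["butterfly"],
    batch_size=4,
    num_experiments=50,
    train_from_pretrained_model="pretrained-yolov3.h5",  # weights from the imageai GitHub release
)
trainer.trainModel()
```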
Data To examine whether the use of object detection improves the accuracy of our model, we duplicated the images
in our training, testing and validation sets, and cropped them
using our trained object detection model described above.
Results After training V2 and M1 on these cropped images
and testing them with the cropped images found in the testing set, the top-1 accuracy improved by 1.6% and 4.5% respectively. Furthermore, M1's top-3 accuracy improved by
2.9%. This illustrates the efficacy of this enhancement.
5.5 Object Detection + Deblurring + CNN
Data To test the effects of object detection combined with deblurring, we first chose 50 images from the test set (10 images from each species) and blurred them using Photoshop's motion blur. The control set consists of the blurred images with object detection applied, and the experimental set consists of the deblurred images, produced by DeblurGAN-v2, with object detection applied.
Results Using V2, the control set resulted in an accuracy of 94.4% while the experimental set resulted in an accuracy of 98.2%, showing that integrating both features does increase the accuracy of the model.
However, it is important to note that the control set yielded
a lower accuracy than the general data set used on the
original CNN model which had not been passed through
DeblurGAN-v2. This might raise the question of why it is
necessary to use DeblurGAN-v2 in our application. We believe that despite the slight decrease in accuracy that might
be due to effects from DeblurGAN-v2, we should still use it
as the images we tested the model with were retrieved from
photography sources such as Flickr that largely contain clear
images of butterflies. In reality, however, the images taken by the children might be blurred. Hence, the resulting predictions can be better if the images obtained from users are passed through DeblurGAN-v2 first.
Another point to note is that although DeblurGAN-v2 is able to produce well-restored images, they still fall short of the high-resolution, sharply focused images found online, which also explains the slight drop in accuracy.
Figure 2: An infographic of our mobile application, ButterSnapped.
6. Application
In this section, we discuss our proposed application, ButterSnapped.
6.1. Objectives
Our target audience is children around the age of 5. Keeping this in mind, we designed our mobile application, ButterSnapped, to cater to their needs and preferences.
6.2. UI Design and Application Features
A key takeaway from research on Children-Computer Interaction (CCI), a research area within Human-Computer Interaction (HCI), is that the first impressions a child has of an
application are crucial to its effectiveness (Majeed 2017).
This motivated us to make a splash screen which is bright,
eye-catching and colourful (refer to figure 2).
In addition, CCI research has shown that too many application functions can confuse children (Majeed 2017).
Hence, we kept our app design to a bare minimum, consisting only of 3 options: “Let’s Go Hunting!”, “Butter-DEX”
and “Catch of the Day”.
CCI research has also shown that children love a good
challenge and that this challenge would retain their attention
for longer spans of time (Junell 2019). Keeping this in mind,
we designed ButterSnapped as a gamified platform, where
the user gets to snap pictures of butterflies and, as a result, gets rewarded with varying amounts of experience according to how rare the photographed butterfly is. Going forward,
the children can then compare their “Butter-DEX” with their
peers. This can in turn motivate them to continue catching
and thus learn more about other species. With this UI design,
we hope to conceal the complexity of our machine learning
algorithms, and only output what the user wants to see.
6.3 Technical Details
Ideally, the predictions will be done using V2 as it correctly identifies butterfly species more often than M1. However,
given that V2 is a dense network, it is not possible to run
it locally on mobile devices. Hence, for V2 to be used, the
images taken have to be sent to a server for processing.
Unfortunately, when the children are using the application, they might be in remote areas of Singapore where the internet connection may be poor, rendering it impossible for the images to be sent to a server. Thus, we propose using M1, a lightweight model that can run locally on mobile devices, when there is no or only a poor internet connection. In other situations, the users' input will be sent to a server to be processed by V2.
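This fallback logic can be sketched as follows; has_internet_connection, send_to_server_v2 and run_local_m1 are hypothetical placeholders for the app's networking layer, the server-side V2 model and the on-device M1 model respectively.

```python
# Connectivity-based model selection sketch; all helpers are hypothetical.
def classify(image):
    if has_internet_connection():
        return send_to_server_v2(image)   # more accurate V2 runs on the server
    return run_local_m1(image)            # lightweight M1 runs on the device
```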
In addition, since the users might upload an image with
multiple butterflies (that are possibly of different species),
it may be difficult to provide them with information about
each butterfly. Therefore, we will use the YOLOv3 model to
detect the butterflies in the image and request the users to
select which butterfly they would like to learn more about.
Upon clicking on the butterfly, the trained CNN will then
predict the species of the selected butterfly.
7. Conclusion
The aim of our project is to create a mobile application that
can address children’s biodiversity-related questions, which
their parents might lack the expertise to answer.
Our model is able to classify, with high accuracy, five
of the most commonly found species of butterflies in Singapore. We propose to build the mobile application around VGG19 and MobileNet. In addition, we integrated two other techniques, object detection (YOLOv3) and image deblurring (DeblurGAN-v2), into the preprocessing steps
of the user-provided images. These enhancements enable
our model to better spot subtle differences amongst different butterfly species, resulting in better classification performance. Furthermore, as a result of our choices of model, enhancements and implementation, our application is able to
respond relatively quickly to user inputs, and is able to run
offline, if necessary.
The properties of the resultant model enable ButterSnapped to be catered specifically to children and bridge
some gaps in their learning that may be difficult for some
parents in Singapore to address.
Nonetheless, our project has a few limitations. For instance, in cases where there are multiple butterflies in the image or where the butterflies are not among the 5 species we trained our model on, the detection model is unable to locate them well. This is due to the lack of data to train the object detection model: creating the dataset requires manual annotation of the images, a painstaking and time-consuming task which prevented us from increasing the size
of our dataset.
In the future, we can increase the size of our dataset and even include more species of plants and animals of interest to children. Alternatively, we can explore using semi-supervised learning to overcome this challenge.
Contributions
Collection of Dataset (Hua Ren, Marcus)
Training of SVM (Run Ding, Hua Ren)
Training of CNN (Lin Wei, Marcus)
Object Detection (Run Ding, Lin Wei)
Deblurring of Images (Ming Chung, Mei An)
UI Design (Hua Ren, Marcus)
References
Bebis G.; and Georgiopoulos M. 1994. Feed-forward neural networks. IEEE Potentials 13(4): 27-31.
Chan L.; Goh L.; Lai S.; and Zhou B. 2015. "Community in Nature": Reconnecting Singapore's Urbanites with Nature. The Nature of Cities.
Dodge S.; and Karam L. 2016. Understanding How Image Quality Affects Deep Neural Networks. Arizona State University.
Engel S. 2013. The Case for Curiosity. Educational Leadership 70(5): 36-40.
Ghazi M. M.; Yanikoglu B.; and Aptoula E. 2017. Plant identification using deep neural networks via optimization of transfer learning parameters. Neurocomputing 235: 228-235.
Honig A. S.; Miller S. A.; and Church E. B. 2003. Ages & Stages: How Curiosity Leads to Learning. Scholastic Early Childhood Today.
Jakkula V. Tutorial on Support Vector Machine. Northeastern University CS5100 Resources.
Junell T. 2019. The Definitive Guide to Building Apps for Kids. Toptal.
Kesebir S.; and Kesebir P. 2017. How Modern Life Became Disconnected from Nature. Greater Good Magazine.
Keskar N. S.; and Socher R. 2017. Improving Generalization Performance by Switching from Adam to SGD.
Kingma D. P.; and Ba J. L. 2015. Adam: A Method for Stochastic Optimization. ICLR 2015.
Kuntal K. P.; and Sudeep K. S. 2016. Preprocessing for image classification by convolutional neural networks. 2016 IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT).
Kupyn O.; Martyniuk T.; Wu J.; and Wang Z. 2019. DeblurGAN-v2: Deblurring (Orders-of-Magnitude) Faster and Better. arXiv:1908.03826.
Lin T.-Y.; Dollár P.; Girshick R.; He K.; Hariharan B.; and Belongie S. 2016. Feature Pyramid Networks for Object Detection. arXiv:1612.03144.
Majeed A. 2017. Designing Apps for Kids – Best Practices.
National Parks. 2019. List of butterfly species present in Singapore. NParks.
National Parks. 2019. Common Butterflies of Singapore. NParks.
Oquab M.; Bottou L.; Laptev I.; and Sivic J. 2014. Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks. 2014 IEEE Conference on Computer Vision and Pattern Recognition: 1717-1724.
Peng Y.; He X.; and Zhao J. 2017. Object-Part Attention Model for Fine-grained Image Classification.
Redmon J.; and Farhadi A. 2018. YOLOv3: An Incremental Improvement. arXiv:1804.02767.
Ripley B. 1996. Pattern Recognition and Neural Networks. Cambridge University Press.
Sandler M.; Howard A.; Zhu M.; Zhmoginov A.; and Chen L.-C. 2019. MobileNetV2: Inverted Residuals and Linear Bottlenecks. arXiv:1801.04381.
Simonyan K.; and Zisserman A. 2015. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv:1409.1556.
Singh M. 2014. What's Going on Inside the Brain of a Curious Child? National Public Radio.
Stephens K. 2007. Curiosity and Wonder: Cue Into Children's Inborn Motivation to Learn. Parenting Exchange.
Stutter M.; Kocher M. G.; Glätzle-Rützler D.; and Trautmann S. T. 2013. Impatience and Uncertainty: Experimental Decisions Predict Adolescents' Field Behavior. American Economic Review.
Tan A. 2020. "Nearly half of Singapore's butterfly species are extinct: Study". The Straits Times.
Wang J.; and Perez L. 2017. The Effectiveness of Data Augmentation in Image Classification using Deep Learning.
Wang J.; Markert K.; and Everingham M. 2009. Learning Models for Object Recognition from Natural Language Descriptions. Proceedings of the 20th British Machine Vision Conference (BMVC 2009).
Weiss K.; Khoshgoftaar T. M.; and Wang D. 2016. A survey of transfer learning. Journal of Big Data 3: 9.
Zhu L.; and Spachos P. 2019. Butterfly Classification with Machine Learning Methodologies for an Android Application. 2019 IEEE Global Conference on Signal and Information Processing (GlobalSIP): 1-5.