CSC 400 - Graduate Problem Seminar - Project Report
A Deep Learning Approach to Universal Skin Disease Classification
Haofu Liao
University of Rochester
Department of Computer Science
haofu.liao@rochester.edu
Abstract
Skin diseases are very common in people's daily lives. Each year, millions of people in America are affected by all kinds of skin disorders. The diagnosis of skin diseases sometimes requires a high level of expertise due to the variety of their visual aspects. As human judgment is often subjective and hardly reproducible, a computer aided diagnostic system should be considered in order to achieve a more objective and reliable diagnosis. In this paper, we investigate the feasibility of constructing a universal skin disease diagnosis system using a deep convolutional neural network (CNN). We train the CNN architecture using the 23,000 skin disease images of the Dermnet dataset and test its performance on images from both Dermnet and OLE, another skin disease dataset. Our system achieves as high as 73.1% Top-1 accuracy and 91.0% Top-5 accuracy when testing on the Dermnet dataset. For the test on the OLE dataset, the Top-1 and Top-5 accuracies are 31.1% and 69.5%. We show that these accuracies can be further improved if more training images are used.
1. Introduction
Skin diseases are among the most common ailments people encounter. Due to the disfigurement and associated hardships, skin disorders cause a great deal of trouble to sufferers [13]. For skin cancer, the facts and figures are even more serious. In the United States, skin cancer is the most common form of cancer. According to a 2012 statistics study, over 5.4 million cases of nonmelanoma skin cancer, including basal cell carcinoma and squamous cell carcinoma, were treated among more than 3.3 million people in America [20]. Each year, there are more new cases of skin cancer than new incidences of cancers of the breast, prostate, lung and colon combined [24]. Research also shows that, in the course of a lifetime, one in five Americans will develop skin cancer [19].
However, the diagnosis of skin disease is challenging. To diagnose a skin disease, a variety of visual clues may be used, such as the individual lesional morphology, the body site distribution, and the color, scaling and arrangement of lesions. When the individual components are analyzed separately, the recognition process can be quite complex [6, 15]. For example, the well-studied skin cancer melanoma has four major clinical diagnosis methods: the ABCD rule, pattern analysis, the Menzies method and the 7-point checklist. To use these methods and achieve good diagnostic accuracy, a high level of expertise is required, as the differentiation of skin lesions needs a great deal of experience [30].
Unlike diagnosis by human experts, which depends heavily on subjective judgment and is hardly reproducible [16, 17, 26], a computer aided diagnostic system is more objective and reliable. By using well-crafted feature extraction algorithms combined with popular classifiers (e.g. SVM and ANN), current state-of-the-art computer aided diagnostic systems [2, 31, 11, 22, 3] can achieve very good performance on certain skin cancers such as melanoma, but they are unable to perform diagnosis over broader classes of skin diseases.
Human-engineered feature extraction is not suitable for a universal skin disease classification system. On one hand, hand-crafted features are usually dedicated to one or a limited number of skin diseases and can hardly be applied to other classes and datasets. On the other hand, due to the diverse nature of skin diseases [6], human engineering for every skin disease is unrealistic. One way to solve this problem is to use feature learning [4], which eliminates the need for feature engineering and lets the machine decide which features to use. Many feature learning based classification systems have been proposed in the past few years [5, 8, 7, 28, 29, 1]. However, they have mostly been restricted to dermoscopy or histopathology images, and they mainly focus on the detection of mitosis, an indicator of cancer [28].
In recent years, deep convolutional neural networks (CNNs) have become very popular for feature learning and object classification. The use of high performance GPUs makes it possible to train a network on a large-scale dataset and thereby yield better performance. Many results [9, 23, 12, 27, 25] from the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) [21] show that state-of-the-art CNN architectures are able to surpass humans in object classification. Recently, Esteva et al. [10] proposed a CNN-based universal skin disease classification system. They train their network by fine-tuning the VGG16 and VGG19 architectures [25]. Their network achieved 60.0% Top-1 and 80.3% Top-3 classification accuracy, which significantly outperformed the human specialists in their experiment.
Inspired by [10], we conducted research along a similar path and achieved better accuracy. We use skin images from two different sources. First, we collect images from Dermnet (www.dermnet.com), a publicly available dataset of more than 23,000 dermatologist-curated skin disease images. Second, we obtain 1,300 skin images from the New York State Department of Health; we call this dataset OLE. We fine-tune our CNNs from the VGG16, VGG19 and GoogleNet [27] models and test them on the two datasets. The results show that our CNNs can achieve 73.1% Top-1 and 91.0% Top-5 classification accuracy on the Dermnet dataset, and 31.1% Top-1 and 69.5% Top-5 classification accuracy on the OLE dataset.
The rest of this report is organized as follows. Section 2 introduces the dataset and the CNN models we use to construct the CNN architecture. Section 3 investigates the performance of the CNNs under different training and test data settings. Conclusions and future work are given in Section 4.
[Figure 1 bar charts omitted: (a) Dermnet and (b) OLE, each plotting class size against class label.]

Figure 1: (a) The number of the Dermnet images in each of the 23 top-level classes. (b) The number of the OLE images for each of the skin diseases in Table 2. The x-axis denotes the labels of the classes; each label is associated with a skin disease class in Table 1. The bars in red are the classes we delete in Section 3.2.
2. Method
2.1. Dataset
We build our skin disease dataset from two different sources: Dermnet and OLE. Dermnet is one of the largest publicly available photo dermatology sources. It has more than 23,000 skin disease images covering a wide variety of skin conditions. Dermnet organizes the skin diseases biologically in a two-level taxonomy. The bottom level contains more than 600 skin diseases at a fine granularity. The top level contains the 23 skin disease classes shown in Table 1; each top-level class contains a subcollection of the bottom-level skin diseases. We adopt the skin disease taxonomy from Dermnet for our classification system and use the 23 top-level skin disease classes to label all the skin disease images. The OLE dataset contains more than 1,300 skin disease images from the New York State Department of Health. It covers 19 skin diseases, each of which maps to one of the bottom-level skin diseases in the Dermnet taxonomy. Hence, we further label these 19 skin diseases with their top-level counterparts in the Dermnet taxonomy. The labeling results are shown in Table 2.
To prepare the dataset, we need to download the 23,000 skin disease images from Dermnet. Since there is no direct link or API for these images, we download them by parsing their addresses and sending HTTP requests to the web server. The downloaded images are not well labeled and contain watermarks, and their naming formats are not consistent. Therefore, multiple name analysis strategies are used to extract class information from the image names. Further, a two-level hierarchical map is established using the extracted class information and the Dermnet taxonomy. All the images from Dermnet are originally labeled using the bottom-level classes. Since we only consider the 23 top-level classes, we merge each of these bottom-level classes according to the two-level hierarchical map and label each image with one of the 23 top-level classes. During the construction of the dataset, if the name of an image contains no class information, or the number of images of some bottom-level class is too small, we discard those images. Figure 1 (a) shows the number of images in each of the 23 top-level skin disease classes. For each image in the OLE dataset, we first parse its file name to get the corresponding skin disease. Then, we label it using the disease-label map in Table 2. The number of images of each skin disease in the OLE dataset is given in Figure 1 (b).
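As an illustration, the following is a minimal sketch of this labeling pipeline (the map entries, the name-parsing strategy and the example file names are simplified assumptions, not our exact scripts). It produces the "filename label" list format that Caffe's ImageData layer consumes (see Section 2.2).

import os

# Hypothetical fragment of the two-level hierarchical map:
# bottom-level disease name -> top-level Dermnet label (Table 1).
TOP_LEVEL_LABEL = {
    "melanoma": 11,
    "psoriasis": 14,
    "tinea versicolor": 18,
    # ... one entry per bottom-level disease kept in the dataset
}

def parse_bottom_level(filename):
    """One simple name-analysis strategy: strip the extension and trailing
    digits, then normalize separators. Returns None if no class is found."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    name = stem.rstrip("0123456789-_ ").replace("-", " ").lower()
    return name if name in TOP_LEVEL_LABEL else None

def write_image_list(image_paths, out_path):
    """Write Caffe's 'filename label' list; images without class info are discarded."""
    with open(out_path, "w") as f:
        for path in image_paths:
            name = parse_bottom_level(path)
            if name is not None:
                f.write("%s %d\n" % (path, TOP_LEVEL_LABEL[name]))

write_image_list(["melanoma-042.jpg", "psoriasis_7.jpg"], "train.txt")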
2.2. CNN Architecture
It has been shown that in many cases transfer learning can be used to efficiently train a deep CNN [18, 32]. In transfer learning, instead of training the network from randomly initialized parameters, one takes a pretrained network and fine-tunes its weights by continuing the backpropagation. This works because the outputs of the early layers of a well-trained network usually contain generic features, such as edges and blobs, that are useful in many tasks and can be applied directly to a new dataset. In our approach, we perform transfer learning by fine-tuning ImageNet [21] pretrained models with Caffe [14], a deep learning framework that supports expressive and efficient deep CNN training. We choose VGG16, VGG19 and GoogleNet as our pretrained models. All of them are trained on the ImageNet dataset and proved to have very good performance in the ILSVRC-2014 competition: VGG16 and VGG19 placed 1st in the task 2a challenge, and GoogleNet is the winner of the task 1b challenge. In our experiments, we fine-tune these models with our skin disease datasets and conduct all the training and tests on an NVIDIA Titan Black GPU to accelerate the computation.
Top-level Skin Disease Categories From Dermnet
0. Acne and Rosacea
1. Malignant Lesions
2. Atopic Dermatitis
3. Bullous Disease
4. Bacterial Infections
5. Eczema
6. Exanthems & Drug Eruptions
7. Hair Diseases
8. STDs
9. Pigmentation Disorders
10. Connective Tissue Diseases
11. Melanoma, Nevi & Moles
12. Nail Diseases
13. Contact Dermatitis
14. Psoriasis & Lichen Planus
15. Infestations & Bites
16. Benign Tumors
17. Systemic Disease
18. Fungal Infections
19. Urticaria
20. Vascular Tumors
21. Vasculitis
22. Viral Infections

Table 1: The 23 top-level categories of the Dermnet taxonomy. We use these top-level categories to label images from Dermnet and OLE. Each category is assigned a numeric label ranging from 0 to 22. Due to layout limitations, long category names are shortened.
Subclass Names             Labels
Rosacea                    0
Actinic Keratosis          1
Basal Cell Carcinoma       1
Squamous Cell Carcinoma    1
Atopic Dermatitis          2
Verruca                    22
Nummular Eczema            5
Lupus Erythematosus        10
Melanoma                   11
Melanocytic Nevus          11
Contact Dermatitis         13
Lichen Planus              14
Pityriasis Rosea           14
Psoriasis                  14
Seborrheic Keratosis       16
Tinea Corporis             18
Tinea Versicolor           18
Urticaria                  19
Herpes                     22

Table 2: The 19 skin diseases from the OLE dataset. We label them according to the Dermnet taxonomy.

Fine-tuning with Caffe requires modifying the network definition (the deploy protocol or the train-validation protocol) of the pretrained models. First, since we need to train the network using our skin disease datasets, the data layer of the pretrained network should be reinitialized. The following is an example of the data layer we used for our datasets.
layer {
  name: "data"
  type: "ImageData"
  top: "data"
  top: "labels"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    crop_size: 224
    mean_file: "imagenet_mean.binaryproto"
  }
  image_data_param {
    source: "train.txt"
    batch_size: 32
    new_height: 256
    new_width: 256
  }
}
In the data layer, we specify the input dataset using the "source" parameter, which names a text file in which each line gives an image filename and a label. We set the input image size and the number of images to process at a time using the "new_height", "new_width" and "batch_size" parameters. We also define the preprocessing of the input images using the "mirror", "crop_size" and "mean_file" parameters. They are set to the same values as in the pretrained models so that the input to the fine-tuned network is at the same scale as the input to the pretrained network. For the last fully connected layer (the one that outputs scores for the skin disease classes), we change the output number to 23 for Dermnet's 23 top-level classes, so that it outputs scores indicating the classification of the input skin disease image. We want this layer to be trained from scratch so that its weights fit our dataset instead of the pretrained model's.
So, we change its name to randomize its weights, and we train it with a higher learning rate to get faster convergence. We also replace the Softmax layer of the deploy protocol with a SoftmaxWithLoss layer so that the loss function can be applied during training.
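For concreteness, the following is a minimal sketch of these modifications using pycaffe's NetSpec interface (the layer name "fc8_skin" and the lr_mult values are illustrative assumptions; any fresh name causes the layer's weights to be randomly reinitialized, and the stand-in fc7 layer abbreviates the pretrained conv1-fc7 stack).

import caffe
from caffe import layers as L

n = caffe.NetSpec()
n.data, n.labels = L.ImageData(                    # reinitialized data layer
    source="train.txt", batch_size=32,
    new_height=256, new_width=256, ntop=2,
    transform_param=dict(mirror=True, crop_size=224,
                         mean_file="imagenet_mean.binaryproto"))
n.fc7 = L.InnerProduct(n.data, num_output=4096)    # stand-in for the pretrained stack
n.fc8_skin = L.InnerProduct(n.fc7, num_output=23,  # renamed final layer, 23 classes
                            param=[dict(lr_mult=10), dict(lr_mult=20)])
n.loss = L.SoftmaxWithLoss(n.fc8_skin, n.labels)   # replaces the plain Softmax layer
print(n.to_proto())

Training then starts from the pretrained weights, e.g. with Caffe's train tool: caffe train -solver solver.prototxt -weights vgg19.caffemodel (file names illustrative). Layers whose names match the pretrained model keep their weights, while the renamed layer is initialized from scratch.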
3. Experiments
3.1. Test on Dermnet Dataset
In our first experiment, we train and test the CNNs using the Dermnet dataset only. We label all the Dermnet images using the labeling strategies introduced in Section 2.1. After construction, we get a set of 17,630 labeled images. We randomly pick 16,630 of them as the training set and the remaining 1,000 as the test set. Then, we fine-tune three CNNs using the three ImageNet pretrained models (VGG16, VGG19 and GoogleNet) respectively. The Top-1 and Top-5 accuracies of the networks, i.e. the percentage of tests in which the ground truth matches the Top-1 (Top-5) prediction(s), are given in Table 3. We can see that all three networks achieve good classification results. VGG19 achieves relatively better Top-1 performance, probably because it has more layers (19) than the other models.

CNN Model    Top-1 Accuracy    Top-5 Accuracy
VGG16        72.7%             91.0%
VGG19        73.1%             90.9%
GoogleNet    71.8%             90.7%

Table 3: Top-1 and Top-5 accuracies of the CNNs using different ImageNet pretrained models. All the CNNs are trained using the Dermnet images only.

Some example images along with their predictions are given in Figure 2. The ground truth is given at the top of each image and the Top-5 predictions together with their probabilities are given below. We can see that melanoma, psoriasis, basal cell carcinoma and systemic disease are predicted correctly with high confidence. The bullous disease image is misclassified with divergent predictions. Viral infections and fungal infections only hit the Top-5 predictions with very low confidence. The benign tumors image is even misclassified as cellulitis with a high-probability prediction. We further analyze the misclassified cases in Section 3.2.

[Figure 2 images omitted. The eight panels show test images with ground truths Melanoma, Basal Cell Carcinoma, Viral Infections, Psoriasis, Bullous Disease, Fungal Infections, Benign Tumors and Systemic Disease, each with its Top-5 predicted classes and probabilities.]

Figure 2: The prediction results output by the fine-tuned GoogleNet network. The label at the top of each image is the ground truth. The Top-5 predictions and the corresponding probabilities are given at the bottom of each image.

The confusion matrices of the three networks are given in Figure 3. The index of each row in a confusion matrix corresponds to a true label and the indices of the columns denote the predicted labels. The color of each cell represents the probability of the prediction, and the number on each cell gives the number of occurrences of that prediction. We can see from the three confusion matrices that the high confidence predictions are located along the diagonal. This indicates that the networks do not suffer high error rates on particular classes, and most of the time the predictions are correct.

[Figure 3 images omitted: confusion matrices for (a) GoogleNet, (b) VGG16 and (c) VGG19.]

Figure 3: The confusion matrices of the VGG16, VGG19 and GoogleNet based CNNs. All three CNNs are trained and tested using the Dermnet images only.
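For reference, the Top-1/Top-5 metric and the confusion matrices above can be computed with a short numpy sketch (here "scores" stands for the network's output class scores over the test set; this is an illustration, not our exact evaluation script):

import numpy as np

def top_k_accuracy(scores, labels, k):
    """scores: (N, 23) class scores; labels: (N,) ground-truth labels.
    A test case counts as a hit if the truth is among the k best classes."""
    topk = np.argsort(scores, axis=1)[:, -k:]
    return float(np.mean([labels[i] in topk[i] for i in range(len(labels))]))

def confusion_matrix(scores, labels, n_classes=23):
    """Rows index the true label, columns the predicted label."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for score, truth in zip(scores, labels):
        cm[truth, score.argmax()] += 1
    return cm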
3.2. Test on OLE Dataset
We then investigate the performance of the CNN on the OLE dataset. Our training set is the 17,630 skin disease images from Dermnet, but for the test set we choose the OLE images. Note that in this case our training set has 1,000 more Dermnet images than in the last experiment, so we need to retrain the CNN. Since all three models show similar performance and the VGG19 model performs slightly better, we fine-tune the CNN with the VGG19 model only. The Top-1 and Top-5 accuracies are shown in Table 4. The new CNN only yields 24.8% Top-1 accuracy and 61.7% Top-5 accuracy on the OLE dataset. This is reasonable, as the OLE dataset may contain some skin disease images whose appearance the CNN cannot learn from the Dermnet dataset. Figure 6 (a) shows the confusion matrix of the CNN on the OLE dataset. As we have shown in Table 2, the OLE dataset only contains a subset of the 23 top-level classes; hence, many of the rows are all zeros. Consistent with the accuracy, the diagonal elements in the confusion matrix show little confidence.
Based on the observation above, we further analyze the CNN by selecting some test images from different skin disease classes and retrieving their nearest neighbors in the training set. We choose the output of the "fc7" layer as the feature vector. We choose this layer over other layers because "fc7" is the last layer before the final output layer (the layer that outputs class scores) and should therefore contain more specific details of the classes in the Dermnet dataset. Besides, the dimension of the feature vector is 4096, which can hold a great deal of information about the input image. To get the nearest neighbors, we first build a feature database for all the skin disease images in the training set. Then, for a given test image, we obtain its feature vector from the "fc7" layer and calculate the Euclidean distance between this feature vector and all the features in the feature database. The five training images with the smallest distances to the input test image are chosen as the nearest neighbors. Figures 4 and 5 show the image retrieval results for two test images from the OLE dataset. In each figure, the top left image is the test image and the rest are the retrieved neighbors. Figure 4 uses a tinea versicolor
image, which is labeled 18 according to Table 2, as the input. We find that most of the retrieved neighbors are labeled 18 and have a very similar appearance to the input image. Similarly, Figure 5 also retrieves a set of neighbors that look very similar to the input, especially the second image in the first row. However, all the retrieved neighbors in Figure 5 have a different label from the input image, which indicates a misclassification. It means the CNN only picks images with a coarse similarity and does not look into the details. One possible reason is that the CNN does not have enough images in the training set to learn the details. For example, in the Figure 5 case, if the training set contained more images shot from the side for the given disease, the CNN would get to learn the detailed information on the face instead of the face itself.
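A sketch of this retrieval procedure with pycaffe follows (the model file names are placeholders, and "feature_db" stands for the (N, 4096) array of "fc7" features computed over all training images):

import numpy as np
import caffe

net = caffe.Net("deploy.prototxt", "weights.caffemodel", caffe.TEST)

def fc7_feature(image):
    """image: preprocessed (3, 224, 224) array; returns its 4096-d fc7 vector."""
    net.blobs["data"].data[0] = image
    net.forward()
    return net.blobs["fc7"].data[0].copy()

def nearest_neighbors(query_image, feature_db, k=5):
    """Indices of the k training images whose fc7 features have the
    smallest Euclidean distance to the query's fc7 feature."""
    dists = np.linalg.norm(feature_db - fc7_feature(query_image), axis=1)
    return np.argsort(dists)[:k]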
To verify our assumption, we design another experiment. As the misclassification is probably due to the lack of training images for certain diseases, we refine both the training set and the test set by removing some skin diseases with inadequate numbers of photos. Thus, if the CNN is trained correctly, it should learn enough details from the refined training set, and for any given test image of a skin disease that is included in the refined training set, the CNN should classify it correctly with higher confidence. The red bars in Figure 1 denote the classes we removed from the training set (Dermnet) and the test set (OLE); we remove the skin diseases that appear in both the training set and the test set and are low in image numbers. We then retrain the CNN using the refined training set. The Top-1 and Top-5 accuracies of the newly trained CNN on the refined test set are given in Table 4.
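A minimal sketch of this refinement step (the count threshold is an illustrative stand-in; in our experiment we removed the specific low-count classes marked red in Figure 1):

from collections import Counter

def refine(image_list, min_count):
    """image_list: [(path, label), ...]; drops classes with too few images."""
    counts = Counter(label for _, label in image_list)
    return [(p, l) for (p, l) in image_list if counts[l] >= min_count]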
[Figure 6 images omitted: confusion matrices for (a) VGG19 and (b) VGG19 Improved.]

Figure 6: The confusion matrices of the CNNs using the VGG19 model only. All the CNNs are tested on the OLE images. For VGG19 Improved, the refined dataset is used.

[Figure 4 images omitted.]

Figure 4: Image retrieval for a correctly classified image. Top left is the test image; the rest are its nearest neighbors. 14: Psoriasis. 18: Fungal Infections.
[Figure 5 images omitted.]

Figure 5: Image retrieval for a misclassified image. Top left is the test image; the rest are its nearest neighbors. 14: Psoriasis. 18: Fungal Infections.
The Top-1 accuracy is 31.1% and the Top-5 accuracy is 69.5%, which are 6.3% and 7.8% higher than in the previous experiment, respectively. The confusion matrix is shown in Figure 6 (b). In a pairwise comparison between (a) and (b), we find that the confusion matrix of the new CNN has higher values in its diagonal elements and lower values in its off-diagonal elements than its old counterpart.
4. Conclusion and Future Work
We have investigated the feasibility of building a universal skin disease classification system using deep CNNs. We tackle this problem by fine-tuning ImageNet pretrained models (VGG16, VGG19, GoogleNet) with the Dermnet dataset. Our experiments show that current state-of-the-art CNN models can achieve as high as 73.1% Top-1 accuracy and 91.0% Top-5 accuracy when testing on the Dermnet dataset. We further investigate the performance of the CNN architecture when testing on a different dataset (OLE). We find the classification system can only achieve 24.8% Top-1 accuracy and 61.7% Top-5 accuracy, due to the lack of broader variance in the training set. We show that by increasing the variance of the training set, the Top-1 and Top-5 accuracies can be improved to 31.1% and 69.5%.
In our future research, we hope to push this work much further and obtain better accuracy. There are several places where we can improve our work. First, since ImageNet data are not specialized for skin images, the ImageNet pretrained models may not be the best choice for skin disease classification; we should therefore train a CNN model from scratch and test its performance. Second, the Dermnet images are organized using a biological taxonomy, which is not the best choice for computer vision applications. We will work with a dermatologist to design a visually organized taxonomy and apply it to our classifier (an idea inspired by [10]). Third, as the experimental results suggest that more variance in the training set leads to better accuracy, we should increase the size of our training set. Also note that the images retrieved by the networks are closely related to the ground truth; we may design a hierarchical classification algorithm using the retrieved images to improve the accuracy.
References
[1] J. Arevalo, A. Cruz-Roa, V. Arias, E. Romero, and F. A. González. An unsupervised feature learning framework for basal cell carcinoma image analysis. Artificial Intelligence in Medicine, 2015.
[2] J. Arroyo and B. Zapirain. Automated detection of melanoma in dermoscopic images. In J. Scharcanski and M. E. Celebi, editors, Computer Vision Techniques for the Diagnosis of Skin Cancer, Series in BioEngineering, pages 139–192. Springer Berlin Heidelberg, 2014.
[3] C. Barata, J. Marques, and T. Mendonça. Bag-of-features classification model for the diagnose of melanoma in dermoscopy images using color and texture descriptors. In M. Kamel and A. Campilho, editors, Image Analysis and Recognition, volume 7950 of Lecture Notes in Computer Science, pages 547–555. Springer Berlin Heidelberg, 2013.
[4] Y. Bengio, A. Courville, and P. Vincent. Representation learning: A review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell., 35(8):1798–1828, Aug. 2013.
[5] H. Chang, Y. Zhou, A. Borowsky, K. Barner, P. Spellman, and B. Parvin. Stacked predictive sparse decomposition for classification of histology sections. International Journal of Computer Vision, 113(1):3–18, 2014.
[6] N. Cox and I. Coulson. Diagnosis of skin disease. Rook's Textbook of Dermatology, 7th edn. Oxford: Blackwell Science, 5, 2004.
[7] A. Cruz-Roa, A. Basavanhally, F. González, H. Gilmore, M. Feldman, S. Ganesan, N. Shih, J. Tomaszewski, and A. Madabhushi. Automatic detection of invasive ductal carcinoma in whole slide images with convolutional neural networks. In SPIE Medical Imaging, pages 904103–904103. International Society for Optics and Photonics, 2014.
[8] A. A. Cruz-Roa, J. E. A. Ovalle, A. Madabhushi, and F. A. G. Osorio. A deep learning architecture for image representation, visual interpretability and automated basal-cell carcinoma cancer detection. In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2013, pages 403–410. Springer, 2013.
[9] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2009), pages 248–255, June 2009.
[10] A. Esteva, B. Kuprel, and S. Thrun. Deep networks for early stage skin disease and skin cancer classification.
[11] G. Fabbrocini, V. Vita, S. Cacciapuoti, G. Leo, C. Liguori, A. Paolillo, A. Pietrosanto, and P. Sommella. Automatic diagnosis of melanoma based on the 7-point checklist. In J. Scharcanski and M. E. Celebi, editors, Computer Vision Techniques for the Diagnosis of Skin Cancer, Series in BioEngineering, pages 71–107. Springer Berlin Heidelberg, 2014.
[12] S. Ioffe and C. Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. CoRR, abs/1502.03167, 2015.
[13] G. K. Jana, A. Gupta, A. Das, R. Tripathy, and P. Sahoo. Herbal treatment to skin diseases: A global approach. Drug Invention Today, 2(8):381–384, August 2010.
[14] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell. Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093, 2014.
[15] C. M. Lawrence and N. H. Cox. Physical Signs in Dermatology: Color Atlas and Text. Wolfe Publishing, 1993.
[16] A. Masood and A. Ali Al-Jumaily. Computer aided diagnostic support system for skin cancer: A review of techniques and algorithms. International Journal of Biomedical Imaging, 2013, 2013.
[17] H. Pehamberger, A. Steiner, and K. Wolff. In vivo epiluminescence microscopy of pigmented skin lesions. I. Pattern analysis of pigmented skin lesions. Journal of the American Academy of Dermatology, 17(4):571–583, 1987.
[18] A. S. Razavian, H. Azizpour, J. Sullivan, and S. Carlsson. CNN features off-the-shelf: An astounding baseline for recognition. CoRR, abs/1403.6382, 2014.
[19] J. K. Robinson. Sun exposure, sun protection, and vitamin D. JAMA, 294(12):1541–1543, 2005.
[20] H. W. Rogers, M. A. Weinstock, S. R. Feldman, and B. M. Coldiron. Incidence estimate of nonmelanoma skin cancer (keratinocyte carcinomas) in the US population, 2012. JAMA Dermatology, 2015.
[21] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV), 115(3):211–252, 2015.
[22] A. Sáez, B. Acha, and C. Serrano. Pattern analysis in dermoscopic images. In J. Scharcanski and M. E. Celebi, editors, Computer Vision Techniques for the Diagnosis of Skin Cancer, Series in BioEngineering, pages 23–48. Springer Berlin Heidelberg, 2014.
[23] P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, and Y. LeCun. OverFeat: Integrated recognition, localization and detection using convolutional networks. CoRR, abs/1312.6229, 2013.
[24] R. L. Siegel, K. D. Miller, and A. Jemal. Cancer statistics, 2015. CA: A Cancer Journal for Clinicians, 65(1):5–29, 2015.
[25] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. CoRR, abs/1409.1556, 2014.
[26] A. Steiner, H. Pehamberger, and K. Wolff. Improvement of the diagnostic accuracy in pigmented skin lesions by epiluminescent light microscopy. Anticancer Research, 7(3 Pt B):433–434, 1986.
[27] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. CoRR, abs/1409.4842, 2014.
[28] H. Wang, A. Cruz-Roa, A. Basavanhally, H. Gilmore, N. Shih, M. Feldman, J. Tomaszewski, F. Gonzalez, and A. Madabhushi. Cascaded ensemble of convolutional neural networks and handcrafted features for mitosis detection. In SPIE Medical Imaging, pages 90410B–90410B. International Society for Optics and Photonics, 2014.
[29] H. Wang, A. Cruz-Roa, A. Basavanhally, H. Gilmore, N. Shih, M. Feldman, J. Tomaszewski, F. Gonzalez, and A. Madabhushi. Mitosis detection in breast cancer pathology images by combining handcrafted and convolutional neural network features. Journal of Medical Imaging, 1(3):034003–034003, 2014.
[30] J. D. Whited and J. M. Grichnik. Does this patient have a mole or a melanoma? JAMA, 279(9):696–701, 1998.
[31] F. Xie, Y. Wu, Z. Jiang, and R. Meng. Dermoscopy image processing for Chinese. In J. Scharcanski and M. E. Celebi, editors, Computer Vision Techniques for the Diagnosis of Skin Cancer, Series in BioEngineering, pages 109–137. Springer Berlin Heidelberg, 2014.
[32] J. Yosinski, J. Clune, Y. Bengio, and H. Lipson. How transferable are features in deep neural networks? CoRR, abs/1411.1792, 2014.