Uploaded by Hussam Sarfraz

CNN Image Detection with MNIST Dataset

5/27/23, 12:26 PM
211668_CI_Project_Resnet1 (1).ipynb - Colaboratory
Convolutional Neural Network (CNN) for Image Detection
Abstract
Deep learning algorithms are loosely modeled on the function of the human cerebral cortex. They are built on deep neural networks, i.e.
neural networks with many hidden layers. Convolutional neural networks (CNNs) are deep learning models, often with millions of parameters,
that can be trained on large datasets; they take 2D images as input and convolve them with filters to produce the desired outputs. In this
article, CNN models are built and evaluated on an image detection task, using the MNIST dataset.
Introduction
Image detection is a classic machine learning problem. Detecting an object or recognizing an image in a digital image or a video is a
challenging task. Image detection has applications in various areas of computer vision, including facial recognition, biometric systems,
self-driving cars, emotion detection, image restoration and robotics. Deep learning algorithms have made great progress in computer
vision. Deep learning implements artificial neural networks with multiple hidden layers, loosely mimicking the functions of the human
cerebral cortex. The layers of a deep neural network extract multiple features and hence provide multiple levels of abstraction. Shallow
networks, by comparison, cannot extract or work on multiple levels of features. Convolutional neural networks are powerful deep learning
models that can have millions of parameters; they reduce computational cost by taking a 2D image as input, convolving it with filters
(kernels) and producing output volumes.
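To make the convolution step concrete, the following is a minimal sketch in plain Python, not code from the notebook. The function name `conv2d_valid` and the toy image and kernel are illustrative assumptions. Like the convolution layers in deep learning libraries, it slides the kernel over the image without flipping it (strictly a cross-correlation), here with stride 1 and no padding.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0
            # multiply the kernel with the image patch under it and sum the products
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        output.append(row)
    return output


# hypothetical 4x4 "image": dark on the left, bright on the right
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# hypothetical 3x3 filter that responds to dark-to-bright vertical edges
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = conv2d_valid(image, kernel)
print(feature_map)
```

A 3x3 kernel on a 4x4 input yields a 2x2 feature map; a convolution layer learns the kernel values instead of fixing them by hand.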
The MNIST dataset contains images of handwritten digits and is widely used to test the performance of classification algorithms.
Handwritten digit recognition has many applications, such as OCR (optical character recognition), signature verification, and the
interpretation and manipulation of text. It is an image classification and recognition problem, and there have been recent advancements
in this field.
MNIST is the dataset used here for image recognition, i.e. for recognition of handwritten digits. The dataset has 70,000 images to train
and test the model, split into 60,000 training images and 10,000 test images. Each image is 28x28 pixels (784 pixels), which are given as
input to the system, and belongs to one of 10 output classes (the digits 0-9). Fig. 1 shows a sample picture from the MNIST dataset.
[Fig. 1: sample image from the MNIST dataset (2.PNG)]
Implementation details
The implementation is done in four categories, varying the kernel size, the use of dropout and batch normalization, the hidden layers
and the padding type. The specifications are below:
1. kernel_size (3,3), without dropout or batch normalization.
2. kernel_size (5,5), max pooling, with dropout and batch normalization.
3. kernel_size (2,2) and strides=(2,2), max pooling, padding="same", with dropout and batch normalization.
4. kernel_size (7,7), max pooling, padding="valid", with dropout and batch normalization.
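As a rough illustration of how these settings change the feature-map size, the sketch below computes the output width of a first convolution layer on a 28x28 input for each category. It is not taken from the notebook. The helper name `conv_output_size` is hypothetical, and wherever the list does not state a stride or padding the Keras defaults (stride 1, padding "valid") are assumed.

```python
import math


def conv_output_size(n, kernel, stride=1, padding="valid"):
    """Output length along one spatial axis for a Keras-style convolution."""
    if padding == "valid":
        # no padding: the kernel must fit entirely inside the input
        return (n - kernel) // stride + 1
    if padding == "same":
        # input is padded so that the output size depends only on the stride
        return math.ceil(n / stride)
    raise ValueError(f"unknown padding: {padding!r}")


# (kernel, stride, padding) per category; unstated values are assumed defaults
configs = {
    1: (3, 1, "valid"),
    2: (5, 1, "valid"),
    3: (2, 2, "same"),
    4: (7, 1, "valid"),
}

first_layer_sizes = {
    category: conv_output_size(28, k, s, p)
    for category, (k, s, p) in configs.items()
}
print(first_layer_sizes)
```

Larger kernels with "valid" padding shrink the feature map faster, while the strided configuration halves it in a single layer.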
Importing libraries
from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
Downloading the dataset.
batch_size = 128
num_classes = 10
epochs = 10
# input image dimensions
img_rows, img_cols = 28, 28
# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()
if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 [==============================] - 2s 0us/step
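Channels first: image data is represented as a three-dimensional array in which the first dimension holds the color channels, e.g.
[channels][rows][cols]. Otherwise (channels last) the channel dimension comes last, e.g. [rows][cols][channels]; the x_train shape printed
below, (60000, 28, 28, 1), shows that this run uses channels last.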
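The two layouts handled by the loading cell can be summarized with a small helper. This is only a sketch for illustration: the function name `image_tensor_shape` is hypothetical and is not part of Keras. It returns the batch shape each branch of the reshape code produces.

```python
def image_tensor_shape(num_images, rows, cols, channels=1,
                       data_format="channels_last"):
    """Return the 4D batch shape for the given Keras-style data format."""
    if data_format == "channels_first":
        # [batch][channels][rows][cols]
        return (num_images, channels, rows, cols)
    if data_format == "channels_last":
        # [batch][rows][cols][channels]
        return (num_images, rows, cols, channels)
    raise ValueError(f"unknown data format: {data_format!r}")


train_shape_last = image_tensor_shape(60000, 28, 28, 1, "channels_last")
train_shape_first = image_tensor_shape(60000, 28, 28, 1, "channels_first")
print(train_shape_last)
print(train_shape_first)
```

Both shapes hold the same 60000 x 784 pixel values; only the position of the channel axis differs.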
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255 #normalizing
x_test /= 255 #normalizing
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
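The conversion of class vectors to binary class matrices is a one-hot encoding. The plain-Python sketch below shows the idea. It is not the Keras implementation: the function name `one_hot` and the sample labels are assumptions chosen for illustration.

```python
def one_hot(labels, num_classes):
    """Turn integer class labels into rows with a single 1.0 at the label index."""
    encoded = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        encoded.append(row)
    return encoded


# hypothetical digit labels, encoded over the 10 MNIST classes
encoded = one_hot([5, 0, 4], 10)
for row in encoded:
    print(row)
```

Each row has length 10 and sums to 1, which is the target format expected by the categorical cross-entropy loss used later.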
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import time
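# this function draws the validation and training loss curves against the epoch number
def plt_dynamic(x, vy, ty, ax, colors=('b',)):
    # `colors` is unused; it is kept so that existing calls keep working
    ax.plot(x, vy, 'b', label="Validation Loss")
    ax.plot(x, ty, 'r', label="Train Loss")
    ax.legend()
    ax.grid()
    # redraw through the axes' own figure instead of relying on a global `fig`
    ax.figure.canvas.draw()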
import tensorflow as tf
from tensorflow.keras.applications import ResNet50
from tensorflow.keras import layers
# Load the pre-trained ResNet model
resnet = ResNet50(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
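# Build the classification model on top of the pre-trained ResNet50 base
model = tf.keras.Sequential()
# 1x1 convolution: maps the single grayscale channel to the 3 channels ResNet50 expects
model.add(layers.Conv2D(3, (1, 1), input_shape=(28, 28, 1)))
# note: x_train and x_test were already divided by 255 above, so this layer scales the values a second time
model.add(layers.Rescaling(1./255))
model.add(layers.Conv2D(3, (3, 3), padding='same'))
# upsample 28x28 to 224x224 to match the ResNet50 input shape
model.add(layers.UpSampling2D((8, 8)))
model.add(resnet)
model.add(layers.GlobalAveragePooling2D())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])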
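The shapes flowing through this stack can be checked with simple arithmetic. The sketch below is an illustration in plain Python rather than output from the notebook. The helper names are hypothetical, and it relies on two standard properties of ResNet50 without its top: the spatial size is reduced by a factor of 32, and the final feature map has 2048 channels.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units


side = 28
upsampled_side = side * 8                  # UpSampling2D((8, 8))
resnet_downsampling = 32                   # ResNet50 total stride
resnet_channels = 2048                     # channels of the last ResNet50 block
feature_map_side = upsampled_side // resnet_downsampling

# trainable parameters of the layers added around the ResNet50 base
added_params = {
    "conv_1x1": conv2d_params(1, 1, 1, 3),
    "conv_3x3": conv2d_params(3, 3, 3, 3),
    "dense_256": dense_params(resnet_channels, 256),
    "dense_10": dense_params(256, 10),
}
print(upsampled_side, feature_map_side)
print(added_params)
```

Global average pooling collapses each 7x7 map to a single value, so the first dense layer receives a 2048-element vector per image.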
history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels_no
94765736/94765736 [==============================] - 6s 0us/step
Epoch 1/10
469/469 [==============================] - 726s 1s/step - loss: 1.3734 - accuracy: 0.6362 - val_loss: 3.2879 - val_accuracy: 0.1135
Epoch 2/10
469/469 [==============================] - 693s 1s/step - loss: 0.3971 - accuracy: 0.9366 - val_loss: 3.4410 - val_accuracy: 0.1135
Epoch 3/10
469/469 [==============================] - 693s 1s/step - loss: 0.1865 - accuracy: 0.9631 - val_loss: 2.9572 - val_accuracy: 0.1237
Epoch 4/10
469/469 [==============================] - 683s 1s/step - loss: 0.1204 - accuracy: 0.9742 - val_loss: 1.4257 - val_accuracy: 0.5896
Epoch 5/10
469/469 [==============================] - 693s 1s/step - loss: 0.0901 - accuracy: 0.9794 - val_loss: 0.0811 - val_accuracy: 0.9811
Epoch 6/10
469/469 [==============================] - 693s 1s/step - loss: 0.0714 - accuracy: 0.9832 - val_loss: 0.0659 - val_accuracy: 0.9831
Epoch 7/10
469/469 [==============================] - 692s 1s/step - loss: 0.0593 - accuracy: 0.9861 - val_loss: 0.0575 - val_accuracy: 0.9851
Epoch 8/10
469/469 [==============================] - 695s 1s/step - loss: 0.0504 - accuracy: 0.9882 - val_loss: 0.0517 - val_accuracy: 0.9865
Epoch 9/10
469/469 [==============================] - 693s 1s/step - loss: 0.0434 - accuracy: 0.9895 - val_loss: 0.0475 - val_accuracy: 0.9874
Epoch 10/10
469/469 [==============================] - 681s 1s/step - loss: 0.0383 - accuracy: 0.9909 - val_loss: 0.0442 - val_accuracy: 0.9877
Test loss: 0.044158075004816055
Test accuracy: 0.9876999855041504
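The `469/469` counter in the log follows from the batch size: 60,000 training samples in batches of 128 need 469 steps, the last one being a partial batch. The short sketch below reproduces that figure and adds up the per-epoch times reported in the log. It is a reading aid with illustrative variable names, not part of the notebook.

```python
import math

train_samples = 60000
batch_size = 128

# number of batches per epoch; the final batch is smaller than batch_size
steps_per_epoch = math.ceil(train_samples / batch_size)
last_batch_size = train_samples - (steps_per_epoch - 1) * batch_size

# per-epoch wall-clock times in seconds, copied from the training log above
epoch_seconds = [726, 693, 693, 683, 693, 693, 692, 695, 693, 681]
total_seconds = sum(epoch_seconds)

print(steps_per_epoch, last_batch_size)
print(total_seconds, round(total_seconds / 3600, 2))
```

At roughly 1 s per step, the ten epochs take close to two hours in total.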
import matplotlib.pyplot as plt
%matplotlib inline
print('Test score:', score[0])
print('Test accuracy:', score[1])
fig,ax = plt.subplots(1,1)
ax.set_xlabel('epoch')
ax.set_ylabel('Categorical Crossentropy Loss')
# list of epoch numbers
x = list(range(1,epochs+1))
vy = history.history['val_loss']
ty = history.history['loss']
plt_dynamic(x, vy, ty, ax)
Test score: 0.044158075004816055
Test accuracy: 0.9876999855041504