5/27/23, 12:26 PM 211668_CI_Project_Resnet1 (1).ipynb - Colaboratory

[Image: 1.PNG]

Convolutional Neural Network (CNN) for Image Detection

Abstract

Deep learning algorithms are designed to mimic the function of the human cerebral cortex. They are implemented as deep neural networks, i.e. neural networks with many hidden layers. Convolutional neural networks are deep learning models with millions of parameters that can be trained on large datasets; they take 2D images as input and convolve them with filters to produce the desired outputs. In this article, CNN models are built and evaluated on image detection data. The algorithm is implemented on MNIST and its performance is evaluated.

INTRODUCTION

Image detection is a classic machine learning problem. Detecting an object or recognizing an image in a digital image or a video is a very challenging task. Image detection has applications across computer vision, including facial recognition, biometric systems, self-driving cars, emotion detection, image restoration, robotics and many more. Deep learning algorithms have achieved great progress in the field of computer vision. Deep learning is an implementation of artificial neural networks with multiple hidden layers that mimics the functions of the human cerebral cortex. The layers of a deep neural network extract multiple features and hence provide multiple levels of abstraction. Shallow networks, by comparison, cannot extract or work on multiple features. The convolutional neural network is a powerful deep learning algorithm capable of dealing with millions of parameters; it saves computational cost by taking a 2D image as input and convolving it with filters (kernels) to produce output volumes. The MNIST dataset contains handwritten digits and is used to test the performance of classification algorithms.
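The convolution step described in the introduction can be illustrated with a minimal NumPy sketch. It is not taken from the notebook: the function name `conv2d_valid`, the synthetic 28x28 input and the 3x3 averaging kernel are assumptions chosen for illustration. Like most deep learning libraries, it slides the kernel without flipping it (strictly a cross-correlation), with stride 1 and no padding.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding ("valid")."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float64)
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the current window and the kernel, summed.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


# Hypothetical 28x28 input (the size of an MNIST digit) and a 3x3 averaging filter.
image = np.arange(28 * 28, dtype=np.float64).reshape(28, 28)
kernel = np.ones((3, 3)) / 9.0

feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)  # (26, 26)
```

Without padding, a 3x3 kernel shrinks each spatial dimension by 2, which is why the 28x28 input yields a 26x26 feature map.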
Handwritten digit recognition has many applications, such as OCR (optical character recognition), signature verification, interpretation and manipulation of texts and many more. It is an image classification and recognition problem, and there have been recent advancements in this field. MNIST is the dataset used here for image recognition, i.e. for recognition of handwritten digits. The dataset has 70,000 images to train and test the model, split into 60,000 training images and 10,000 test images. Each image is 28x28 pixels (784 pixels), which are given as input to the system, and there are 10 output class labels (0-9). Fig. 1 shows a sample picture from the MNIST dataset.

[Image: 2.PNG]

Implementation details

The implementation is done in four categories by changing the kernel sizes, adding dropout and batch normalization, and varying the hidden layers and padding type. The specs are below:

1. Kernel_size (3,3), without dropout and batch normalization.
2. Kernel_size (5,5), max_pooling, with dropout and batch normalization.
3. Kernel_size (2,2) and strides=(2,2), max_pooling, padding = "same", with dropout and batch normalization.
4. Kernel_size (7,7), max_pooling, padding = "valid", with dropout and batch normalization.

Importing libraries

from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K

Downloading the dataset.
batch_size = 128
num_classes = 10
epochs = 10

# input image dimensions
img_rows, img_cols = 28, 28

# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# add an explicit channel axis in the position the backend expects
if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 [==============================] - 2s 0us/step

Notebook: https://colab.research.google.com/drive/1Ysj9o5tyJs1DphrPMNRCsysLnYPJI-a-#scrollTo=i0mPdqCdZc00&printMode=true

Channels first: each image is represented as a three-dimensional array whose first axis holds the color channels, e.g. [channels][rows][cols]. Channels last puts them on the last axis instead, e.g. [rows][cols][channels].
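The two layouts can be compared on a small array. The sketch below is illustrative only: the batch of four synthetic 28x28 images and the variable names are assumptions, not part of the notebook. It shows the reshape used for each layout and how `np.transpose` converts one into the other.

```python
import numpy as np

# Hypothetical batch of 4 grayscale images, 28x28 each, without a channel axis.
batch = np.arange(4 * 28 * 28, dtype=np.float32).reshape(4, 28, 28)

# channels_last: (samples, rows, cols, channels)
channels_last = batch.reshape(batch.shape[0], 28, 28, 1)

# channels_first: (samples, channels, rows, cols)
channels_first = batch.reshape(batch.shape[0], 1, 28, 28)

# Converting an existing channels_last array moves the last axis to position 1.
converted = np.transpose(channels_last, (0, 3, 1, 2))

print(channels_last.shape)   # (4, 28, 28, 1)
print(channels_first.shape)  # (4, 1, 28, 28)
print(np.array_equal(converted, channels_first))  # True
```

For single-channel images a plain reshape is enough, as in the notebook; with three color channels the transpose is the safe way to switch layouts, because a reshape would scramble the pixels.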
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255  # normalize pixel values to [0, 1]
x_test /= 255  # normalize pixel values to [0, 1]
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples

%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import time

# draw the validation and training loss curves on `ax` and refresh its figure
def plt_dynamic(x, vy, ty, ax):
    ax.plot(x, vy, 'b', label="Validation Loss")
    ax.plot(x, ty, 'r', label="Train Loss")
    ax.legend()
    ax.grid()
    ax.figure.canvas.draw()

import tensorflow as tf
from tensorflow.keras.applications import ResNet50
from tensorflow.keras import layers

# Load the pre-trained ResNet model
resnet = ResNet50(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# Create your own classification model
model = tf.keras.Sequential()
# 1x1 convolution: map the single grayscale channel to the 3 channels ResNet50 expects
model.add(layers.Conv2D(3, (1, 1), input_shape=(28, 28, 1)))
# x_train was already divided by 255 above, so this rescales the activations a second time
model.add(layers.Rescaling(1./255))
model.add(layers.Conv2D(3, (3, 3), padding='same'))
# upsample 28x28 -> 224x224 to match the ResNet50 input_shape
model.add(layers.UpSampling2D((8, 8)))
model.add(resnet)
model.add(layers.GlobalAveragePooling2D())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels_no
94765736/94765736 [==============================] - 6s 0us/step
Epoch 1/10
469/469 [==============================] - 726s 1s/step - loss: 1.3734 - accuracy: 0.6362 - val_loss: 3.2879 - val_accuracy: 0.1135
Epoch 2/10
469/469 [==============================] - 693s 1s/step - loss: 0.3971 - accuracy: 0.9366 - val_loss: 3.4410 - val_accuracy: 0.1135
Epoch 3/10
469/469 [==============================] - 693s 1s/step - loss: 0.1865 - accuracy: 0.9631 - val_loss: 2.9572 - val_accuracy: 0.1237
Epoch 4/10
469/469 [==============================] - 683s 1s/step - loss: 0.1204 - accuracy: 0.9742 - val_loss: 1.4257 - val_accuracy: 0.5896
Epoch 5/10
469/469 [==============================] - 693s 1s/step - loss: 0.0901 - accuracy: 0.9794 - val_loss: 0.0811 - val_accuracy: 0.9811
Epoch 6/10
469/469 [==============================] - 693s 1s/step - loss: 0.0714 - accuracy: 0.9832 - val_loss: 0.0659 - val_accuracy: 0.9831
Epoch 7/10
469/469 [==============================] - 692s 1s/step - loss: 0.0593 - accuracy: 0.9861 - val_loss: 0.0575 - val_accuracy: 0.9851
Epoch 8/10
469/469 [==============================] - 695s 1s/step - loss: 0.0504 - accuracy: 0.9882 - val_loss: 0.0517 - val_accuracy: 0.9865
Epoch 9/10
469/469 [==============================] - 693s 1s/step - loss: 0.0434 - accuracy: 0.9895 - val_loss: 0.0475 - val_accuracy: 0.9874
Epoch 10/10
469/469 [==============================] - 681s 1s/step - loss: 0.0383 - accuracy: 0.9909 - val_loss: 0.0442 - val_accuracy: 0.9877
Test loss: 0.044158075004816055
Test accuracy: 0.9876999855041504

import matplotlib.pyplot as plt
%matplotlib inline
print('Test score:', score[0])
print('Test accuracy:', score[1])
fig, ax = plt.subplots(1, 1)
ax.set_xlabel('epoch')
ax.set_ylabel('Categorical Crossentropy Loss')
# list of epoch numbers
x = list(range(1, epochs + 1))
vy = history.history['val_loss']
ty = history.history['loss']
plt_dynamic(x, vy, ty, ax)

Test score: 0.044158075004816055
Test accuracy: 0.9876999855041504
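Two figures in the training log follow directly from the settings used in the notebook, and the arithmetic can be checked with a short sketch. The variable names below are illustrative; the numbers are the sample counts, batch size and test accuracy reported in the notebook.

```python
import math

train_samples = 60000
test_samples = 10000
batch_size = 128
test_accuracy = 0.9876999855041504  # value printed by model.evaluate in the notebook

# Each epoch processes the training set in batches; the last batch may be partial.
steps_per_epoch = math.ceil(train_samples / batch_size)

# Accuracy converted to an error rate and an approximate count of misclassified digits.
error_rate_percent = 100.0 * (1.0 - test_accuracy)
misclassified = round(test_samples * (1.0 - test_accuracy))

print(steps_per_epoch, misclassified, round(error_rate_percent, 2))  # 469 123 1.23
```

The 469 steps match the `469/469` progress counter in the log, and a test accuracy of about 98.77% corresponds to roughly 123 of the 10,000 test digits being misclassified.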