Convolutional Neural Networks (CNN) with Keras in Python

This tutorial explains how to build a Convolutional Neural Network (CNN) for the MNIST handwritten digits dataset using the Keras deep learning library. MNIST is the standard introductory dataset for learning neural network image classification in computer vision and deep learning.

The MNIST dataset contains 28×28 pixel grayscale images of handwritten digits from 0 to 9. It has 60,000 samples for training and 10,000 samples for testing.

.     .     .

Develop a Baseline Model

The Keras API provides the MNIST dataset built in. Let’s load it using Keras in Python.

In [1]:
from keras.datasets import mnist
(trainX, trainy), (testX, testy) = mnist.load_data()
print('Train Data : X={} Y={}'.format(trainX.shape, trainy.shape))
print('Test Data  : X={} y={}'.format(testX.shape, testy.shape))

Out[1]:
Train Data : X=(60000, 28, 28) Y=(60000,)
Test Data  : X=(10000, 28, 28) y=(10000,)

Let’s plot a few samples from the dataset.

In [2]:
import matplotlib.pyplot as plt
for i in range(9):
    plt.subplot(330 + 1 + i)
    plt.imshow(trainX[i], cmap=plt.get_cmap('gray'))
plt.show()

Out[2]: [Figure: 3×3 grid of the first nine training digit images]

To develop a baseline model for handwritten digit recognition, we further divide the training dataset into two parts: a training set and a validation set. The Keras API supports this through the “validation_data” parameter of the model.fit() method.

The Keras API also provides a “validation_split” parameter in the model.fit() method, which splits the training dataset into train and validation sets directly, so we do not need to supply a validation dataset explicitly.

# Validation by specifying validation_data 
model.fit(..., validation_data=(valX, valY))

# Validation by specifying validation_split 
model.fit(..., validation_split=0.2)
This will split the training dataset and hold out 20% of the data for validation.
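
If you want to build the validation set yourself, here is a minimal sketch using a plain 80/20 slice of the arrays loaded above (the split point and variable names are illustrative, and model stands for the compiled Keras model we build later in this tutorial):

# Minimal sketch: hold out the last 20% of the training data as an
# explicit validation set and pass it via validation_data.
split = int(0.8 * len(trainX))
valX, valy = trainX[split:], trainy[split:]
model.fit(trainX[:split], trainy[:split], validation_data=(valX, valy))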

We need to reshape the data arrays to have a single color channel.

In [3]:
trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))
testX = testX.reshape((testX.shape[0], 28, 28, 1))
print('trainX : {} '.format(trainX.shape))
print('testX  : {} '.format(testX.shape))

Out[3]:
trainX : (60000, 28, 28, 1) 
testX  : (10000, 28, 28, 1)

There are 10 classes in total, one for each digit from 0 to 9. We use one-hot encoding for the class labels. The Keras API provides the utility function to_categorical() for one-hot encoding.

In [4]:
from keras.utils import to_categorical
trainY = to_categorical(trainy)
testY = to_categorical(testy)
print('trainY shape : {} '.format(trainY.shape))
print('testY  shape : {} '.format(testY.shape))

Out[4]:
trainY shape : (60000, 10) 
testY  shape : (10000, 10)
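
To see what the encoding looks like, compare a raw label with its one-hot vector. The first training label in MNIST happens to be the digit 5, so its encoded vector has a 1 at index 5:

# The first training label is the digit 5, so its one-hot vector
# has a 1 at index 5 and 0 everywhere else.
print(trainy[0])   # 5
print(trainY[0])   # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]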

Pixel values of an image are in the range 0 to 255. Generally, we achieve better performance by feeding normalized input values to the neural network.

Let’s normalize the pixel values to the range [0, 1]. We can do this by converting the data type to float and then dividing the pixel values by the maximum value, 255.

In [5]:
train_norm = trainX.astype('float32')
test_norm = testX.astype('float32')
# normalize to range [0,1]
train_norm = train_norm / 255.0
test_norm = test_norm / 255.0

Let’s define a baseline convolutional neural network model and train it.

In [6]:
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dense, Dropout, Flatten
num_classes = 10
def prepare_model():
    model = Sequential()
    model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(num_classes, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
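
Before training, it can be useful to inspect the architecture with model.summary(), which prints each layer’s output shape and parameter count:

# Print a layer-by-layer overview of the network (optional).
prepare_model().summary()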

In [7]:
model = prepare_model()
history = model.fit(train_norm, trainY, batch_size=128, validation_split=0.2, epochs=3, verbose=1)

Out[7]:
Train on 48000 samples, validate on 12000 samples
Epoch 1/3
48000/48000 [==============================] - 163s 3ms/step - loss: 0.2748 - acc: 0.9157 - val_loss: 0.0685 - val_acc: 0.9801
Epoch 2/3
48000/48000 [==============================] - 159s 3ms/step - loss: 0.0937 - acc: 0.9721 - val_loss: 0.0457 - val_acc: 0.9872
Epoch 3/3
48000/48000 [==============================] - 181s 4ms/step - loss: 0.0707 - acc: 0.9784 - val_loss: 0.0420 - val_acc: 0.9879

Learning Curves

Let’s take a look at the learning curves of the training and validation accuracy and loss.

In [8]:
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

plt.figure(figsize=(8, 8))
plt.subplot(2, 1, 1)
plt.plot(acc, label='Training Accuracy')
plt.plot(val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.ylabel('Accuracy')
plt.ylim([min(plt.ylim()),1])
plt.title('Training and Validation Accuracy')

plt.subplot(2, 1, 2)
plt.plot(loss, label='Training Loss')
plt.plot(val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.ylabel('Cross Entropy')
plt.ylim([0,1.0])
plt.title('Training and Validation Loss')
plt.xlabel('epoch')
plt.show()

Out[8]: [Figure: training vs. validation accuracy (top) and cross-entropy loss (bottom) over the three epochs]

Model Evaluation

Let’s evaluate the trained model on test data and observe the accuracy.

In [9]:
score = model.evaluate(test_norm, testY, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

Out[9]:
Test loss: 0.0346724861223207
Test accuracy: 0.9887
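
With the model trained, we can also predict individual digits. model.predict() returns a probability distribution over the 10 classes for each input image, and the predicted digit is the index with the highest probability. A minimal sketch:

import numpy as np
# Predict class probabilities for the first test image; argmax gives
# the most likely digit.
probs = model.predict(test_norm[:1])
print('Predicted digit:', np.argmax(probs[0]))
print('Actual digit   :', testy[0])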

Training a very deep neural network on a large dataset takes a long time, sometimes days or even weeks. Instead of training the model each time, we should save the trained model and reuse it for prediction.

Please refer to this tutorial to learn how to save a trained model and load it to make predictions on new test samples.
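
As a quick sketch of the idea (the filename here is illustrative), saving and reloading a Keras model looks like this:

from keras.models import load_model
# Persist the trained model to disk ...
model.save('mnist_cnn.h5')
# ... and load it back later to predict without retraining.
restored_model = load_model('mnist_cnn.h5')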

.     .     .
