Transfer Learning for Image Recognition Using Pre-Trained Models

Transfer Learning is a Deep Learning technique in which a model developed for one task is reused as the starting point for a model on a different task or domain. Instead of training a model from scratch, it is often better to reuse a pre-trained model that was trained on a large dataset.

Transfer Learning is widely used to solve Computer Vision and Natural Language Processing tasks, where it can improve the performance of a neural network. Deep Neural Networks trained on very large datasets such as ImageNet and COCO are commonly used as the starting point for transfer learning.

A Pre-Trained model is a saved model that was previously trained on a very large dataset. With limited computational power, training a very large neural network on a large dataset is often impractical, which is why reusing such models is attractive. These models can be used for prediction, feature extraction, and fine-tuning.

In this tutorial, we will see how to use a Pre-Trained model to classify the object in an image. Here we will use the ResNet50 Pre-Trained Model for image classification.

A separate tutorial covers Transfer Learning via Feature Extraction.

ResNet50 is a convolutional neural network that is trained on more than a million images from the ImageNet database. The network is 50 layers deep and can classify images into 1000 object categories, such as a keyboard, mouse, pencil, and many animals. The default input image size for a ResNet50 model is 224×224 pixels.

.     .     .

Develop an Image classifier using Pre-Trained Model

Let’s see the implementation of transfer learning for image recognition using ResNet50 Pre-Trained Model with Keras.

Load the ResNet50 Model in Keras

First of all, we will load the ResNet50 model in the Keras Deep Learning library. Keras provides the Applications module (keras.applications) for loading and using pre-trained models. The model can be created as follows:

from keras.applications import resnet50
model = resnet50.ResNet50()

Keras will download the weight file for the ResNet50 model the first time you run this code and store it on your local machine.

The weight file is roughly 100 MB, so the download may take some time. The weights are downloaded only once; the next time you create the model, they are loaded from the local copy.
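To check whether the weights are already on disk, you can look in the Keras cache directory. The sketch below assumes the default location ~/.keras/models (overridable through the KERAS_HOME environment variable) and weight files with an .h5 extension; the helper name list_cached_weights is hypothetical and not part of Keras.

```python
import os
from pathlib import Path


def list_cached_weights(cache_dir=None):
    """Return (file name, size in MB) pairs for weight files in the Keras cache."""
    if cache_dir is None:
        # Assumption: default cache location; KERAS_HOME replaces ~/.keras if set.
        keras_home = os.environ.get("KERAS_HOME", str(Path.home() / ".keras"))
        cache_dir = Path(keras_home) / "models"
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        return []  # nothing has been downloaded yet
    return sorted(
        (p.name, round(p.stat().st_size / 2**20, 1))
        for p in cache_dir.glob("*.h5")  # assumption: weights are stored as .h5 files
    )


for name, size_mb in list_cached_weights():
    print(f"{name}: {size_mb} MB")
```

If nothing is printed, the cache is empty or lives somewhere else, and creating the model will trigger a download.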

Let’s print the summary of the ResNet50 model as follows:

In [1]:
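from keras.applications import resnet50
# Create ResNet50 with its default pre-trained ImageNet weights.
model = resnet50.ResNet50()
# summary() prints the layer table itself and returns None, so print() is not needed.
model.summary()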

This will print the details of all layers of the ResNet50 network.

Load & Prepare Image

Let’s predict the class of the image below. You can download this image and save it in the current working directory with the name test_img.jpg.

Here, we will use the Keras load_img() function to load the image and resize it to 224×224 pixels.

In [2]:
from keras.preprocessing.image import load_img
image = load_img('test_img.jpg', target_size=(224, 224))

Next, we will convert the image to a NumPy array using the img_to_array() function.

In [3]:
from keras.preprocessing.image import img_to_array
image = img_to_array(image)
print(image.shape)

Out[3]:
(224, 224, 3)

The ResNet50 model expects a 4-dimensional input array of shape [samples, rows, columns, channels]. In our case, the input is a single image with shape [rows, columns, channels], so we need to add an extra dimension by calling the reshape() method.

You can also use NumPy’s expand_dims() function to add an extra dimension to the input array.

In [4]:
re_image = image.reshape((1, image.shape[0], image.shape[1], image.shape[2]))
print(re_image.shape)

Out[4]:
(1, 224, 224, 3)

In [5]:
import numpy as np
ex_image = np.expand_dims(image, axis=0)
print(ex_image.shape)

Out[5]:
(1, 224, 224, 3)
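The same [samples, rows, columns, channels] layout also holds when several images are classified at once. As a minimal sketch, the example below uses zero-filled stand-in arrays (an assumption, in place of real images loaded with img_to_array()) to show that reshape() and expand_dims() give the same result and that np.stack() builds a larger batch.

```python
import numpy as np

# Stand-in arrays with the shape img_to_array() gives for a 224x224 RGB image.
images = [np.zeros((224, 224, 3), dtype="float32") for _ in range(3)]

# A single image: reshape() and expand_dims() produce identical batches of size 1.
single_a = images[0].reshape((1,) + images[0].shape)
single_b = np.expand_dims(images[0], axis=0)
print(single_a.shape, np.array_equal(single_a, single_b))

# Several images: stack them along a new first axis to form one batch.
batch = np.stack(images, axis=0)
print(batch.shape)
```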

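Next, we will prepare the pixel values of the test image in the same way the ImageNet training data was prepared, using the Keras preprocess_input() function. For ResNet50, this function converts the image from RGB to BGR and zero-centers each color channel with respect to the ImageNet dataset, without scaling.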

In [6]:
image = resnet50.preprocess_input(re_image)
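To make the preprocessing step less of a black box, here is a rough NumPy sketch of the Caffe-style transformation that ResNet50 preprocessing applies to channels-last input: reverse the channel order from RGB to BGR, then subtract the ImageNet per-channel means. The function name caffe_style_preprocess is hypothetical, and the sketch is only an illustration; in real code, prefer resnet50.preprocess_input().

```python
import numpy as np

# Per-channel ImageNet means in BGR order, as used by Caffe-style preprocessing.
IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype="float32")


def caffe_style_preprocess(batch_rgb):
    """Approximate ResNet50 preprocessing for a channels-last RGB batch."""
    batch = np.asarray(batch_rgb, dtype="float32")
    batch_bgr = batch[..., ::-1]          # reverse the channel axis: RGB -> BGR
    return batch_bgr - IMAGENET_MEAN_BGR  # zero-center each channel; no rescaling


# One 1x1 "image" whose single pixel is pure red in RGB.
demo = np.array([[[[255.0, 0.0, 0.0]]]])
print(caffe_style_preprocess(demo))
```

Unlike this sketch, which always returns a new array, the Keras function may modify a floating-point NumPy input in place.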

Now the image array is ready for predicting the class label. Let’s make a prediction.

Make a Prediction

Next, we will use the predict() method to make a prediction on the new observation. It returns the probability of the image belonging to each of the 1000 classes.

In [7]:
prediction = model.predict(image)
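For a single input image, the returned array has shape (1, 1000), and each row is a softmax output, so its values sum to approximately 1. The sketch below illustrates this with synthetic scores (an assumption, standing in for real model output) and shows how the index of the most probable class can be read off with argmax.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for model.predict(image): softmax over 1000 random scores.
logits = rng.normal(size=(1, 1000))
exp = np.exp(logits - logits.max(axis=1, keepdims=True))
fake_prediction = exp / exp.sum(axis=1, keepdims=True)

print(fake_prediction.shape)         # one row per input image
print(float(fake_prediction.sum()))  # probabilities of the single row add up to about 1

# Index of the most probable class for the first (and only) image.
class_index = int(np.argmax(fake_prediction, axis=1)[0])
print(class_index, float(fake_prediction[0, class_index]))
```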

Here, we want the most likely class label for the input image out of the 1000 classes. Keras provides a function called decode_predictions() to interpret the probabilities. It returns a list of classes with their probabilities, the top five by default.

In [8]:
label = resnet50.decode_predictions(prediction)
print(label)

Out[8]:
[[('n04557648', 'water_bottle', 0.99991035),
  ('n04560804', 'water_jug', 4.238839e-05),
  ('n03825788', 'nipple', 2.7651658e-05),
  ('n03983396', 'pop_bottle', 1.7307366e-05),
  ('n03916031', 'perfume', 1.1240984e-06)]]
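Conceptually, decoding amounts to sorting each row of probabilities and mapping the highest-scoring indices to class names. The sketch below shows that idea with a hypothetical three-class label map; the index assignments and the toy_decode helper are assumptions for illustration, not the actual Keras implementation or the real ImageNet index.

```python
import numpy as np

# Hypothetical toy label map; the real one covers all 1000 ImageNet classes.
TOY_CLASSES = {
    0: ("n04557648", "water_bottle"),
    1: ("n04560804", "water_jug"),
    2: ("n03983396", "pop_bottle"),
}


def toy_decode(predictions, top=2):
    """Return the `top` most probable (class ID, name, probability) tuples per row."""
    results = []
    for row in np.asarray(predictions):
        best = np.argsort(row)[::-1][:top]  # indices sorted by descending probability
        results.append([TOY_CLASSES[int(i)] + (float(row[i]),) for i in best])
    return results


print(toy_decode([[0.7, 0.1, 0.2]]))
```

The real decode_predictions() accepts a similar top argument, so resnet50.decode_predictions(prediction, top=3) returns only the three most likely classes.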

Let’s get the most likely class label as follows:

In [9]:
most_label = label[0][0]
print('Predicted class: %s  (%.2f%%)' % (most_label[1], most_label[2]*100))

Out[9]:
Predicted class: water_bottle  (99.99%)
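In practice, it can be useful to accept a prediction only when the model is reasonably confident. The sketch below filters the decoded output from Out[8] by a probability threshold; the helper name confident_labels and the 0.5 cutoff are arbitrary assumptions for illustration.

```python
# Decoded output copied from Out[8] above.
label = [[
    ("n04557648", "water_bottle", 0.99991035),
    ("n04560804", "water_jug", 4.238839e-05),
    ("n03825788", "nipple", 2.7651658e-05),
    ("n03983396", "pop_bottle", 1.7307366e-05),
    ("n03916031", "perfume", 1.1240984e-06),
]]


def confident_labels(decoded_row, threshold=0.5):
    """Keep (name, probability) pairs whose probability reaches the threshold."""
    return [(name, prob) for _, name, prob in decoded_row if prob >= threshold]


matches = confident_labels(label[0])
if matches:
    name, prob = matches[0]
    print(f"Predicted class: {name} ({prob:.2%})")
else:
    print("No class passed the confidence threshold.")
```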
