TensorFlow is a deep learning library. Deep learning practitioners generally use the Keras Sequential or Functional API to build a neural network architecture, since Keras makes it easy to create a model by stacking multiple layers. However, every built-in Keras layer comes with a fixed default behaviour, and you have little control over it.
The good news is that TensorFlow also supports customization with more flexibility: by subclassing the Model class, you can create your own feed-forward model with custom layer designs. The beauty of model customization is that you have full control over every nuance of the model. Building a model this way takes more effort, but it is worth it for the complete control you gain.
In this tutorial, you will learn how to create a custom model with custom layers in TensorFlow. You will also see how to write custom training and evaluation code for the model. This entire tutorial was written against TensorFlow version 2.1.0. Let's start building the model by importing the TensorFlow package.
Import Required Packages
import tensorflow as tf

print(tf.__version__)
2.1.0
Import dataset
For this experiment, we use the Iris flower dataset, which consists of three species of iris (Setosa, Versicolour, and Virginica). The dataset has four features: the length and width of the sepals and petals. It is a multi-class classification problem: we need to build a machine learning model that classifies Iris flowers by species.
Let's load the Iris dataset. It has 50 samples for each class.
from sklearn import datasets

iris = datasets.load_iris()
X = iris.data
y = iris.target
print(f"X.shape = {X.shape}")
print(f"y.shape = {y.shape}")
X.shape = (150, 4)
y.shape = (150,)
Build the Model
In this section, we create a custom linear layer and a custom model using TensorFlow's Keras API. To create the custom layer, we subclass the Layer class, initialize the weight w and bias b, and define the layer's computation. We then use the Model class to define the custom neural network architecture.
The Layer class
To create a dense (linear) layer, we inherit from the Keras Layer class. The layer has a weight w and a bias b. You can create these parameters in either the __init__ or the build method; however, they are ideally created in the build method, while the layer's computation is defined in the call method.
This custom dense layer takes the number of units as a constructor argument. The input dimension is read from input_shape inside build, and together these two values define the shapes of the weight and bias.
class Dense_Layer(tf.keras.layers.Layer):
    def __init__(self, units):
        super(Dense_Layer, self).__init__()
        self.units = units

    def build(self, input_shape):
        # Create the weight and bias once the input shape is known
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer='random_normal',
                                 trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer='random_normal',
                                 trainable=True)

    def call(self, inputs):
        # Linear transformation: inputs . w + b
        return tf.matmul(inputs, self.w) + self.b
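To see the lazy weight creation in action, here is a minimal sanity check (an illustrative addition, not part of the original tutorial): build runs on the first call, once the input shape is known.

layer = Dense_Layer(units=3)
out = layer(tf.ones((2, 4)))  # build() runs here with input_shape (2, 4)
print(out.shape)              # (2, 3)
print(len(layer.weights))     # 2 --> w and b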
The Model class
TensorFlow's Keras API provides the Model class for defining a model architecture. A Model groups multiple subclassed layers, built from the Layer class, into one trainable object.
Let's create the custom model by inheriting from the Keras Model class. As with the Layer class, the inner layers are defined in the __init__ or build method, and the computation is defined in the call method.
In the example below, the model consists of three dense layers (two hidden layers and a softmax output layer). The layers are initialized in __init__ with their unit counts and chained together in the call method, with dropout applied only during training.
class Custom_Model(tf.keras.Model):
    def __init__(self):
        super(Custom_Model, self).__init__()
        self.dense1 = Dense_Layer(50)
        self.dense2 = Dense_Layer(12)
        self.dense3 = Dense_Layer(3)
        self.dropout = tf.keras.layers.Dropout(0.2)

    def call(self, input_tensor, training=False):
        x = self.dense1(input_tensor)
        x = tf.nn.relu(x)
        if training:
            # Dropout is active only during training
            x = self.dropout(x, training=training)
        x = self.dense2(x)
        x = tf.nn.relu(x)
        x = self.dense3(x)
        x = tf.nn.softmax(x)  # output class probabilities
        return x

model = Custom_Model()
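Like the custom layer, the subclassed model creates its weights lazily. As a quick check (an illustrative addition), we can call it once on dummy input and then inspect it:

_ = model(tf.ones((1, 4)))  # triggers weight creation for all layers
model.summary()             # parameter counts are now available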
Prepare the training data
We are going to train the model on the Iris data, which is a multi-class classification problem, so we need to one-hot encode the target variable.
from tensorflow.keras.utils import to_categorical

one_hot_Y = to_categorical(y)
print("One hot Encoding --> target shape :", one_hot_Y.shape)
One hot Encoding --> target shape : (150, 3)
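For instance, the first sample's integer label 0 (Setosa) becomes a three-element vector with a 1 in position 0 (a small illustrative check):

print(y[0], "->", one_hot_Y[0])  # 0 -> [1. 0. 0.]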
Split train & validation data
from sklearn.model_selection import train_test_split

X_train, X_val, Y_train, Y_val = train_test_split(X, one_hot_Y, test_size=0.2, random_state=42)
print(f"Training data : X_train.shape : {X_train.shape} & Y_train.shape : {Y_train.shape}")
print(f"Validation data : X_val.shape : {X_val.shape} & Y_val.shape : {Y_val.shape}")
Training data : X_train.shape : (120, 4) & Y_train.shape : (120, 3)
Validation data : X_val.shape : (30, 4) & Y_val.shape : (30, 3)
Define Loss function
To train the model, we need a loss function that evaluates the model's performance by measuring the difference between the model's predicted target values and the actual target values. Our goal during training is to minimize this loss.
Here, we use the CategoricalCrossentropy loss function, which takes the actual one-hot targets and the model's predicted class probabilities as input and returns the loss value. Since our model already applies a softmax in its call method, we keep the default from_logits=False.
loss_fn = tf.keras.losses.CategoricalCrossentropy()
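To get a feel for what the loss returns, here is a toy batch with hand-made probabilities (an illustrative example): confident correct predictions yield a small loss.

y_true = tf.constant([[0., 1., 0.], [0., 0., 1.]])
y_pred = tf.constant([[0.05, 0.9, 0.05], [0.1, 0.2, 0.7]])
print(loss_fn(y_true, y_pred).numpy())  # ~0.231, the mean of -ln(0.9) and -ln(0.7)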
Define a Gradient function
To optimize the model, we calculate the gradients of the loss function, i.e., its partial derivatives with respect to the model's parameters. TensorFlow records the forward pass with tf.GradientTape() and uses the recorded operations to compute these gradients. The pattern looks like this:
with tf.GradientTape() as tape:
    logits = model(x_batch_train)                # model's predictions
    loss_value = loss_fn(y_batch_train, logits)  # calculate loss

grads = tape.gradient(loss_value, model.trainable_weights)  # calculate gradients
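If GradientTape is new to you, this standalone toy example (an illustrative addition) shows the mechanism on a single variable: the derivative of x^2 at x = 3 is 6.

x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x * x
print(tape.gradient(y, x).numpy())  # 6.0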
Define an Optimizer
We need an optimizer to minimize the loss: it uses the computed gradients of the loss function to update the model's parameters. TensorFlow provides various optimization algorithms; here we use the Adam optimizer with a learning rate of 0.01.
optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)
Define an Accuracy Metric
Let's set up accuracy metrics for training and validation so we can observe the performance of the model.
train_acc_metric = tf.keras.metrics.CategoricalAccuracy()
val_acc_metric = tf.keras.metrics.CategoricalAccuracy()
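Keras metrics are stateful: each update accumulates into a running result until reset_states() is called. A small illustrative example:

m = tf.keras.metrics.CategoricalAccuracy()
m.update_state([[0., 1., 0.]], [[0.1, 0.8, 0.1]])  # predicted class matches the target
print(m.result().numpy())                          # 1.0
m.reset_states()                                   # start fresh, e.g. for the next epoch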
Define batched train & validation datasets
During training, the model processes a fixed number of samples at a time, called a batch. We iteratively execute the training step on each batch. Let's set the batch size to 16, which means each batch contains 16 samples.
# Prepare the training dataset.
batch_size = 16
train_dataset = tf.data.Dataset.from_tensor_slices((X_train, Y_train))
train_dataset = train_dataset.shuffle(buffer_size=1024).batch(batch_size)

# Prepare the validation dataset.
val_dataset = tf.data.Dataset.from_tensor_slices((X_val, Y_val))
val_dataset = val_dataset.batch(batch_size)
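As a quick sanity check (illustrative), we can peek at one batch to verify its shape:

for x_batch, y_batch in train_dataset.take(1):
    print(x_batch.shape, y_batch.shape)  # (16, 4) (16, 3)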
Define Training Loop
Let's set up the training loop. It performs the following steps:
- Iterate over the epochs; one epoch is one pass through the entire training dataset.
- Within each epoch, iterate over the training batches.
- Run the forward pass and compute the model's predictions.
- Calculate the loss and the gradients.
- Run the optimizer to update the parameters of the model.
- Log the loss value.
- Repeat the same procedure for each epoch.
# define epochs
epochs = 5

for epoch in range(epochs):
    print("\n")
    print(f"Epoch : {epoch+1}")

    # Iterate over the batches of the train dataset
    for step, (x_batch_train, y_batch_train) in enumerate(train_dataset):

        # During the forward pass, open a GradientTape to record operations
        with tf.GradientTape() as tape:
            logits = model(x_batch_train, training=True)  # model's predictions
            loss_value = loss_fn(y_batch_train, logits)   # calculate loss value

        # Retrieve the gradients of the loss w.r.t. the trainable weights
        grads = tape.gradient(loss_value, model.trainable_weights)

        # Run the optimizer, which updates the model parameters to minimize the loss
        optimizer.apply_gradients(zip(grads, model.trainable_weights))

        # Update training accuracy metric
        train_acc_metric(y_batch_train, logits)

        # Print a log of the loss value at every 5th step
        if step % 5 == 0:
            print(f"Training Loss at step {step} : {loss_value:.3f}")

    print()
    # Print training accuracy at the end of each epoch
    train_acc = train_acc_metric.result()
    print(f"Training Accuracy : {train_acc:.3f}")

    # Reset training metric at the end of each epoch
    train_acc_metric.reset_states()

    # Run the model on validation data at the end of each epoch
    for x_batch_val, y_batch_val in val_dataset:
        val_logits = model(x_batch_val, training=False)
        val_acc_metric(y_batch_val, val_logits)

    # Display validation accuracy
    val_acc = val_acc_metric.result()
    # Reset validation metric
    val_acc_metric.reset_states()
    print(f"Validation Accuracy : {val_acc:.3f}")
Epoch : 1
Training Loss at step 0 : 1.092
Training Loss at step 5 : 1.049
Training Accuracy : 0.367
Validation Accuracy : 0.533

Epoch : 2
Training Loss at step 0 : 0.978
Training Loss at step 5 : 0.837
Training Accuracy : 0.642
Validation Accuracy : 0.700

Epoch : 3
Training Loss at step 0 : 0.740
Training Loss at step 5 : 0.545
Training Accuracy : 0.658
Validation Accuracy : 0.700

Epoch : 4
Training Loss at step 0 : 0.466
Training Loss at step 5 : 0.441
Training Accuracy : 0.783
Validation Accuracy : 0.967

Epoch : 5
Training Loss at step 0 : 0.332
Training Loss at step 5 : 0.291
Training Accuracy : 0.917
Validation Accuracy : 0.967
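As a side note, eager training loops like this one can be sped up by compiling the training step into a static graph with @tf.function. A minimal sketch of that variant, assuming the same model, loss, optimizer, and metric defined above:

@tf.function
def train_step(x, y):
    # Same logic as the loop body above, compiled into a graph
    with tf.GradientTape() as tape:
        logits = model(x, training=True)
        loss_value = loss_fn(y, logits)
    grads = tape.gradient(loss_value, model.trainable_weights)
    optimizer.apply_gradients(zip(grads, model.trainable_weights))
    train_acc_metric(y, logits)
    return loss_value

The inner loop would then simply call loss_value = train_step(x_batch_train, y_batch_train).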
Evaluate the Model
Let's evaluate the model's performance on unseen test data. Here are the test samples:
test_dataset = tf.convert_to_tensor([
    [5.8, 2.7, 3.9, 1.2],
    [4.7, 3.2, 1.6, 0.2],
    [7.7, 2.6, 6.9, 2.3],
    [4.8, 3.0, 1.4, 0.1],
    [6.7, 2.5, 5.8, 1.8]
])
test_dataset.shape
TensorShape([5, 4])
Class_Label = ['Iris setosa', 'Iris versicolor', 'Iris virginica']

# If the model contains layers with different behaviour during
# training and inference, such as Dropout, pass training=False.
# Our model already applies a softmax in its call method, so its
# output is a class probability distribution; applying softmax
# again would be redundant.
test_probabilities = model(test_dataset, training=False)
predicted_class = tf.argmax(test_probabilities, 1)
predicted_class_label = tf.gather(Class_Label, predicted_class)

for ex, pred in zip(test_dataset, predicted_class_label):
    tf.print(ex, pred)
[5.8 2.7 3.9 1.2] "Iris versicolor"
[4.7 3.2 1.6 0.2] "Iris setosa"
[7.7 2.6 6.9 2.3] "Iris virginica"
[4.8 3 1.4 0.1] "Iris setosa"
[6.7 2.5 5.8 1.8] "Iris virginica"
In this tutorial, you have explored how to build a custom neural network model with custom layers using TensorFlow's Keras API, along with custom training and evaluation of the model.
. . .