Random Forest

Random forest is a supervised machine learning algorithm that works by building a collection of decision trees. It is a meta estimator that fits a number of decision tree classifiers on various sub-samples of the dataset and uses averaging to improve predictive accuracy and control over-fitting. The random forest algorithm can be used for both classification and regression problems.
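The idea of fitting trees on sub-samples and combining them can be sketched by hand. The snippet below is a minimal illustration, not Scikit-Learn's internal implementation: the toy arrays X and y, the seed, and the variable names are assumptions made for this example. It trains ten decision trees on bootstrap samples and combines them with a majority vote (Scikit-Learn's own RandomForestClassifier averages the trees' probabilistic predictions instead of counting hard votes).

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy data for illustration only: two features and a binary target.
X = np.array([[5, 5], [3, 4], [5, 6], [7, 9], [5, 1], [7, 2], [3, 9], [4, 8]])
y = np.array([1, 1, 1, 1, 0, 0, 0, 0])

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable
trees = []
for i in range(10):
    # Bootstrap sample: draw row indices with replacement.
    idx = rng.integers(0, len(X), size=len(X))
    # max_features="sqrt" makes each split consider a random subset of features.
    tree = DecisionTreeClassifier(max_features="sqrt", random_state=i)
    tree.fit(X[idx], y[idx])
    trees.append(tree)

# One row of 0/1 predictions per tree, one column per sample.
votes = np.stack([tree.predict(X) for tree in trees])

# Majority vote: predict class 1 when at least half of the trees say 1.
forest_pred = (votes.mean(axis=0) >= 0.5).astype(int)
print(forest_pred)
```

Because every tree sees a different bootstrap sample, the individual trees disagree on some points, and combining them smooths out those individual errors.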

If you are not familiar with decision trees, please refer to this tutorial on Decision Tree first. Because the decision tree is the building block of a random forest, you should be aware of how a decision tree works.

The Scikit-Learn library provides an implementation of the Random Forest algorithm with a variety of parameters. This tutorial explains how Random Forest works with an example.

Let’s build a Random Forest classifier on a small example dataset and visualize what a tree looks like. The data contains the width and height of a house in feet, and we need to predict whether the shape of the house is regular or irregular. Here the target variable is binary:

Target:

  • 0 – Irregular shape of a house
  • 1 – Regular shape of a house

 

Width (W)   Height (H)   Target
5           5            1
4           3            1
1           5            0
9           7            1
2           7            0
6           5            1
9           3            0
8           4            0
8 4 0

Let’s create a Random Forest classifier on the above data using the Scikit-Learn library.

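from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import export_graphviz
import pandas as pd

# Build a forest of 10 decision trees
model = RandomForestClassifier(n_estimators=10)

# h = height, w = width, target = 1 (regular) or 0 (irregular)
data = {'h': [5, 3, 5, 7, 5, 7, 3, 4],
        'w': [5, 4, 6, 9, 1, 2, 9, 8],
        'target': [1, 1, 1, 1, 0, 0, 0, 0]}
df = pd.DataFrame(data)

# Shuffle the rows (sampling all 8 rows without replacement)
df = df.sample(8)

# Separate the features from the target and train the model
train_x = df.drop('target', axis=1)
train_y = df['target']
model.fit(train_x, train_y)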
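Once the model is fitted, it can be used to classify houses it has not seen. The following is a minimal sketch: the two rows in new_houses are hypothetical values chosen for illustration, and random_state=0 is added only to make the run repeatable. It uses predict for the class label and predict_proba for the class probabilities averaged over the trees.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Same training data as in the tutorial.
data = {'h': [5, 3, 5, 7, 5, 7, 3, 4],
        'w': [5, 4, 6, 9, 1, 2, 9, 8],
        'target': [1, 1, 1, 1, 0, 0, 0, 0]}
df = pd.DataFrame(data)

model = RandomForestClassifier(n_estimators=10, random_state=0)
model.fit(df[['h', 'w']], df['target'])

# Hypothetical new houses (height, width), not part of the original data.
new_houses = pd.DataFrame({'h': [5, 4], 'w': [5, 9]})

pred = model.predict(new_houses)         # predicted class per row (0 or 1)
proba = model.predict_proba(new_houses)  # columns follow model.classes_
print(pred)
print(proba)
```

Each row of proba sums to 1, and its columns are ordered as in model.classes_, so the second column is the estimated probability of a regular shape.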

Visualizing a Tree

Here, the Random Forest classifier built 10 trees, each on a random sample of the data, because we passed n_estimators=10. Let’s look at a graphical visualization of the first tree in the forest. Please refer to this tutorial for a detailed explanation of visualizing a tree using the Scikit-Learn library.

# Visualize the first tree of the forest
from subprocess import call
from IPython.display import Image

estimator = model.estimators_[0]

# Export the tree as a dot file
export_graphviz(estimator, out_file='tree1.dot',
                feature_names=['h', 'w'],
                class_names=['0', '1'],
                rounded=True, proportion=False,
                precision=2, filled=True)

# Convert to png using a system command (requires Graphviz)
call(['dot', '-Tpng', 'tree1.dot', '-o', 'tree1.png', '-Gdpi=600'])

# Display in a Jupyter notebook
Image(filename='tree1.png')
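If Graphviz is not installed, a tree can still be inspected as plain text. The sketch below is one possible alternative, not the approach used above: it relies on export_text from sklearn.tree, and random_state=0 is an assumption added so the printed rules are repeatable.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import export_text

# Same training data as in the tutorial.
data = {'h': [5, 3, 5, 7, 5, 7, 3, 4],
        'w': [5, 4, 6, 9, 1, 2, 9, 8],
        'target': [1, 1, 1, 1, 0, 0, 0, 0]}
df = pd.DataFrame(data)

model = RandomForestClassifier(n_estimators=10, random_state=0)
model.fit(df[['h', 'w']], df['target'])

# Print the split rules of the first tree as indented text.
rules = export_text(model.estimators_[0], feature_names=['h', 'w'])
print(rules)
```

Changing the index in model.estimators_ prints the rules of another tree, which is a quick way to compare trees without generating image files.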

Each tree in a Random Forest is different. Here, we have plotted only a single tree. We can also constrain how the trees grow using parameters of the Scikit-Learn Random Forest model such as max_depth and min_samples_leaf. Please refer to the documentation to understand the parameters thoroughly.
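The effect of such constraints can be checked directly on the fitted trees. The following sketch uses illustrative parameter values (max_depth=2, min_samples_leaf=2, random_state=0), which are assumptions for this example rather than recommended settings, and reports the depth and number of leaves of every tree in the forest.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Same training data as in the tutorial.
data = {'h': [5, 3, 5, 7, 5, 7, 3, 4],
        'w': [5, 4, 6, 9, 1, 2, 9, 8],
        'target': [1, 1, 1, 1, 0, 0, 0, 0]}
df = pd.DataFrame(data)

# Limit every tree to depth 2 and require at least 2 samples in each leaf.
model = RandomForestClassifier(n_estimators=10, max_depth=2,
                               min_samples_leaf=2, random_state=0)
model.fit(df[['h', 'w']], df['target'])

# Inspect how far each tree actually grew.
depths = [tree.get_depth() for tree in model.estimators_]
leaves = [tree.get_n_leaves() for tree in model.estimators_]
print(depths)
print(leaves)
```

No tree grows deeper than max_depth, and the depths and leaf counts usually vary from tree to tree because each tree is trained on a different bootstrap sample.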
