K-Nearest Neighbors (KNN)

K-Nearest Neighbours (KNN) is a simple, easy-to-implement, versatile algorithm, and one of the most popular in the Machine Learning field. KNN is a non-parametric supervised Machine Learning algorithm that can be used for both classification and regression problems. The letter K represents the number of nearest neighbours considered, and it is one of the core factors in KNN. Choosing the best value of K for your data points is a challenging task.

Predictions for a new observation are made by searching through the entire training dataset to find the closest neighbours. The computational cost of KNN therefore grows with the size of the training dataset. The distance measure can be Euclidean distance, Hamming distance, Manhattan distance, or Minkowski distance; Euclidean distance is the most commonly used in KNN.

Euclidean, Manhattan, and Minkowski distances are used for continuous variables. To fit a KNN model on categorical variables, the Hamming distance is used.
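As a minimal sketch, the four distance measures can be written in plain Python (the function names and sample points here are illustrative, not from the original article):

```python
import math

def euclidean(a, b):
    # square root of the sum of squared per-feature differences
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    # sum of absolute per-feature differences
    return sum(abs(x - y) for x, y in zip(a, b))

def minkowski(a, b, p):
    # generalises Manhattan (p = 1) and Euclidean (p = 2)
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

def hamming(a, b):
    # number of positions where categorical values differ
    return sum(x != y for x, y in zip(a, b))

print(euclidean((0, 0), (3, 4)))            # 5.0
print(manhattan((0, 0), (3, 4)))            # 7
print(hamming(("red", "S"), ("red", "M")))  # 1
```

Note that Minkowski distance with p = 2 reduces to Euclidean distance, which is why Euclidean is often treated as the default special case.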

.     .     .

Pseudo Code of KNN

  • Load the data
  • Initialise the value of K, i.e. how many nearest neighbours should be considered.
  • To predict the class label of a new observation, iterate through the training data:
      1.  Calculate the distance between the test observation and each row of the training data using a chosen distance measure such as Euclidean, Hamming, Manhattan, or Minkowski distance.
      2. Sort the calculated distances in ascending order.
      3. Take the top K rows from the sorted list.
      4. Return the most frequent class among these rows as the predicted class.
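The pseudocode above can be sketched as a short Python implementation. This is a minimal illustration using Euclidean distance and a made-up toy dataset (the function name, dataset, and labels are all assumptions for demonstration, not part of the original article):

```python
import math
from collections import Counter

def knn_predict(train, new_point, k):
    """Predict the class of new_point from labelled training rows.

    train is a list of (features, label) pairs; k is the number of
    nearest neighbours that vote on the predicted class.
    """
    # 1. Distance from the new observation to every training row
    distances = [(math.dist(features, new_point), label)
                 for features, label in train]
    # 2. Sort the distances in ascending order
    distances.sort(key=lambda pair: pair[0])
    # 3. Keep the labels of the top k rows
    top_k = [label for _, label in distances[:k]]
    # 4. Return the most frequent class among them
    return Counter(top_k).most_common(1)[0][0]

# Toy dataset: two clusters labelled "purple" and "red"
train = [((1, 1), "purple"), ((1, 2), "purple"), ((2, 1), "purple"),
         ((6, 6), "red"), ((6, 7), "red"), ((7, 6), "red")]

print(knn_predict(train, (2, 2), k=3))  # purple
print(knn_predict(train, (6, 5), k=3))  # red
```

Note there is no training step beyond storing the data: all the work happens at prediction time, which is why KNN is often called a "lazy" learner.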

.     .     .

Example

Let’s plot the training dataset to demonstrate how the KNN algorithm works:

Now we will classify a new observation, shown as a green dot, into either the purple or the red class. Here, we consider three different values of K (1, 2, and 3) for prediction using the KNN algorithm.

[Figures: classification of the green point for K = 1, K = 2, and K = 3]
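The same experiment can be sketched in a few lines of self-contained Python. The coordinates below are illustrative stand-ins for the plotted points (not taken from the original figures):

```python
import math
from collections import Counter

# Illustrative coordinates standing in for the plotted clusters
purple = [(1, 1), (1, 2), (2, 1)]
red = [(6, 6), (6, 7), (7, 6)]
green = (2, 2)  # the new observation to classify

labelled = [(p, "purple") for p in purple] + [(p, "red") for p in red]
# Sort every training point by its distance to the green dot
by_distance = sorted(labelled, key=lambda row: math.dist(row[0], green))

# Let the K nearest points vote, for K = 1, 2, and 3
for k in (1, 2, 3):
    votes = Counter(label for _, label in by_distance[:k])
    print(f"K={k}: {votes.most_common(1)[0][0]}")
```

With these toy coordinates the green dot sits inside the purple cluster, so all three values of K agree; in general, different K values can produce different predictions for points near the class boundary.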

.     .     .

Pros and Cons of KNN

Pros:

  • Very easy to implement.
  • No separate training phase is required.
  • Works well with non-linear data.

Cons:

  • Prediction can be slow when the number of training samples is large.
  • Computationally expensive, because it stores all the training samples.
  • Sensitive to the chosen value of the parameter K.

.     .     .

