What is K-Nearest Neighbors Algorithm

We covered classification in the previous chapter. If you have not read it yet, please start there, since this chapter builds on it to introduce the K-Nearest Neighbors (KNN) algorithm, one of the classification algorithms in machine learning.

What is K-Nearest Neighbors Algorithm in Classification

The K-Nearest Neighbors (KNN) algorithm is a machine learning algorithm used for both classification and regression tasks. It is based on the idea that similar data points tend to have similar outcomes. In this chapter we break the KNN algorithm down into simple steps, explain the concepts, and work through an illustrative example. We also cover how to choose the right value of ‘K’ and demonstrate the algorithm with Python code. If you still have questions afterwards, feel free to ask in the comment section.

K-Nearest Neighbors Example

Imagine a telecommunications company that wants to customize offers for its customers based on demographic data. The goal is to predict which service usage group a customer belongs to, using features such as age and income. This is a classification problem: we want to build a model that can predict the group for new customers. This is only an example; the sections below walk through the steps involved in implementing the KNN algorithm from start to finish.

How K-Nearest Neighbors Algorithm works

KNN (K-Nearest Neighbors) is a classification algorithm in machine learning that works by measuring the similarity between data points in a dataset. Similar cases tend to have the same class labels, so a new point is assigned the label that is most common among the points closest to it. The “K” in K-Nearest Neighbors refers to the number of nearest neighbors to consider when making a prediction.

Another K-Nearest Neighbors Algorithm Example

For example, suppose we have a dataset of customers with age and income as features, and we have segmented them into four groups: Basic Service, E Service, Plus Service, and Total Service. Our goal is to predict the group for a new customer based on their age and income. We will see how to do this in Python later in the chapter.

How to find Euclidean distance

To measure the similarity between two data points, we use a distance metric such as the Euclidean distance. If we have two customers with features (age1, income1) and (age2, income2), the Euclidean distance between them is calculated as:

distance = sqrt((age2 - age1)**2 + (income2 - income1)**2)

The smaller this distance, the more similar the two data points are; the larger it is, the more dissimilar they are.
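
The formula above can be sketched as a small Python function. The function name `euclidean_distance` and the two sample customers are made up for illustration; the values were picked so the result is easy to check by hand.

```python
import math

def euclidean_distance(point_a, point_b):
    """Return the Euclidean distance between two equal-length feature tuples."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(point_a, point_b)))

# Two hypothetical customers described as (age, income)
customer_1 = (25, 40000)
customer_2 = (28, 40004)

print(euclidean_distance(customer_1, customer_2))  # sqrt(3**2 + 4**2) = 5.0
```

Because the function loops over all features, it also works if more features than age and income are added later.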

K-Nearest Neighbors Algorithm Steps

The main question is how to apply the KNN algorithm in practice. The steps are:

Step 1: Choose a value for ‘K’.
Step 2: Calculate the distance between the new data point and all points in the dataset.
Step 3: Select the ‘K’ nearest neighbors based on the shortest distances.
Step 4: Count the frequency of each class among the ‘K’ neighbors.
Step 5: Assign the class label with the highest frequency to the new data point.
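
To make the five steps concrete, here is a minimal from-scratch sketch using only the Python standard library. The function name `knn_predict` and the small dataset are hypothetical, and this is one possible way to write the steps, not how any particular library implements them. It assumes Python 3.8 or later for `math.dist`.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, new_point, k):
    """Classify new_point by majority vote among its k nearest training points."""
    # Step 2: distance from the new point to every point in the dataset
    distances = [
        (math.dist(point, new_point), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Step 3: keep the k neighbors with the shortest distances
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Step 4: count how often each class appears among those neighbors
    votes = Counter(label for _, label in nearest)
    # Step 5: return the most frequent class
    return votes.most_common(1)[0][0]

# Hypothetical (age, income) data for illustration only
points = [(25, 40000), (27, 42000), (45, 90000), (50, 95000), (48, 88000)]
labels = ["Basic", "Basic", "Plus", "Plus", "Plus"]

# Step 1: choose K (here K = 3)
print(knn_predict(points, labels, (47, 91000), k=3))  # Plus
```

In this sketch, if two classes receive the same number of votes, `Counter.most_common` returns the one that was counted first, which here is the class of the closer neighbor.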

How to find ‘k’ in K-Nearest Neighbors Algorithm?

The next step is choosing the right value of ‘K’. A small ‘K’ makes the model sensitive to noise in the data, while a large ‘K’ can oversimplify the result by averaging over too many neighbors. To find a good ‘K’, split your data into training and testing sets. Start with ‘K’ equal to 1 and gradually increase it, measuring the accuracy on the testing set each time. Then choose the ‘K’ that gives the best accuracy.
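
The search for ‘K’ described above might look like the following sketch with sklearn. The data here is synthetic and generated only for illustration (two made-up groups, with income expressed in thousands); the range of K values tried and the 25% test split are arbitrary assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data: two hypothetical groups in (age, income in thousands)
rng = np.random.default_rng(0)
group_a = rng.normal(loc=[30, 40], scale=5, size=(100, 2))
group_b = rng.normal(loc=[45, 70], scale=5, size=(100, 2))
X = np.vstack([group_a, group_b])
y = np.array(["Basic"] * 100 + ["Plus"] * 100)

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Try K = 1, 2, ..., 15 and record the accuracy on the testing set
accuracies = {}
for k in range(1, 16):
    model = KNeighborsClassifier(n_neighbors=k)
    model.fit(X_train, y_train)
    accuracies[k] = model.score(X_test, y_test)  # fraction classified correctly

# Pick the K with the highest accuracy (the smallest such K if several tie)
best_k = max(accuracies, key=accuracies.get)
print("Best K:", best_k, "accuracy:", accuracies[best_k])
```

On a real project, a separate validation set or cross-validation is often used for this search so that the testing set remains untouched for the final evaluation.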

Python Example Scenario

Let’s consider a scenario where we have a scatter plot of customers with age and income as features, divided into the four service groups. We want to predict the group for a new customer with age 35 and income $50,000. The Python implementation follows.

K-Nearest Neighbors Algorithm in Python

Here’s a simple Python code snippet to demonstrate the KNN algorithm using the sklearn library:

from sklearn.neighbors import KNeighborsClassifier
import numpy as np

# Example dataset: age and income of existing customers
X = np.array([[25, 40000], [30, 50000], [22, 35000], [40, 60000]])
# Corresponding labels (service groups)
y = np.array(['Basic', 'E', 'Basic', 'Plus'])

# Create and train KNN model
knn = KNeighborsClassifier(n_neighbors=3)  # Set K = 3
knn.fit(X, y)

# New customer data
new_customer = np.array([[35, 50000]])

# Predict the group for the new customer
predicted_group = knn.predict(new_customer)

print("Predicted Group:", predicted_group)

Output

Predicted Group: ['Basic']
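
With only four training points, it can help to check which neighbors the model actually used. The sketch below rebuilds the same dataset and calls the `kneighbors` and `predict_proba` methods of `KNeighborsClassifier` to show the neighbors and the vote shares. For this data the three nearest neighbors carry three different labels, so each class gets exactly one vote and the printed prediction is the outcome of a tie, which the library has to break somehow.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Same example dataset as above
X = np.array([[25, 40000], [30, 50000], [22, 35000], [40, 60000]])
y = np.array(['Basic', 'E', 'Basic', 'Plus'])

knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
new_customer = np.array([[35, 50000]])

# kneighbors returns the distances to, and row indices of, the K nearest points
distances, indices = knn.kneighbors(new_customer)
neighbor_labels = y[indices[0]]
print("Neighbor labels:", neighbor_labels)

# predict_proba reports the share of neighbor votes per class, in knn.classes_ order
vote_shares = knn.predict_proba(new_customer)[0]
print(dict(zip(knn.classes_, vote_shares)))
```

A larger training set makes such ties less likely, so a result like this on a toy dataset should be read as an illustration of the API rather than a meaningful prediction.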

Conclusion

To conclude, the K-Nearest Neighbors (KNN) algorithm is a versatile tool for classification problems. By finding similar data points and making predictions based on their class labels, KNN provides a simple yet effective approach to pattern recognition. Choosing the right ‘K’ value is essential for accurate predictions, and it can be found by evaluating the model’s performance on a testing dataset. With Python and libraries like sklearn, implementing and experimenting with the KNN algorithm is straightforward. I hope this chapter was useful. Thanks for reading.

Link: https://Codelikechamp.com
