What is K-Nearest Neighbor (KNN)?
An introduction to machine learning algorithms
The K-Nearest Neighbor (KNN) algorithm is a popular model that falls under Supervised Learning, and it can be used to solve both classification and regression problems.
In this article, I will give you a detailed explanation of how this model works.
What is K-Nearest Neighbor (KNN)?
K-Nearest Neighbor is one of the simplest Machine Learning algorithms, based on the Supervised Learning technique. The KNN algorithm assumes similarity between the new data and the available data, and puts the new case into the category that is most similar to the available categories.
How does KNN work for classification problems?
Imagine we have data that looks like this: in the graph there are two types of points, green and yellow.
Now let's say we have a new point, and we have to predict whether it falls into the green category or the yellow category.
Next we define K, the number of neighbours we examine. In this case K = 5, which means the new (blue) point looks at its 5 nearest neighbours and is assigned to the category that the majority of those 5 points belong to.
Here the 5 points nearest to the new point are 3 yellow and 2 green, so the point is more likely to be yellow.
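The majority-vote idea above can be sketched in a few lines of plain Python. The data points and labels here are made up to mirror the example (3 yellow and 2 green among the 5 nearest neighbours); the function names are my own, not from any library:

```python
from collections import Counter
import math

def knn_classify(points, labels, query, k=5):
    """Classify `query` by majority vote among its k nearest neighbors."""
    # Sort training point indices by Euclidean distance to the query point
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], query))
    # Take the labels of the k closest points and pick the most common one
    nearest_labels = [labels[i] for i in order[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical 2-D data: 3 yellow and 2 green points sit close to the query
points = [(1, 1), (1, 2), (2, 1), (2, 2), (0, 1), (6, 6), (7, 7)]
labels = ["yellow", "yellow", "yellow", "green", "green", "yellow", "green"]
print(knn_classify(points, labels, query=(1.5, 1.5), k=5))  # → yellow
```

With K = 5 the vote among the nearest neighbours is 3 yellow to 2 green, so the new point is classified as yellow, exactly as in the example.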
How does KNN work for regression problems?
Now imagine we have a dataset like this, and we have to find the value of the star.
Again we choose a value of K; here I take K = 3. The algorithm finds the 3 nearest values, which are 55, 50, and 51, and takes their average to get the predicted value for the point: (55 + 50 + 51) / 3 = 52.
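The averaging step can be sketched the same way. The coordinates below are invented; only the three nearest values (55, 50, 51) come from the example above:

```python
import math

def knn_regress(points, values, query, k=3):
    """Predict the value at `query` as the average of the k nearest values."""
    # Indices of the k training points closest to the query
    nearest = sorted(range(len(points)), key=lambda i: math.dist(points[i], query))[:k]
    return sum(values[i] for i in nearest) / k

# Hypothetical 2-D data; the three closest points carry the values 55, 50, 51
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 9)]
values = [55, 50, 51, 90, 95]
print(knn_regress(points, values, query=(1.5, 1.5), k=3))  # → 52.0
```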
How do we calculate the distance between points?
To calculate the distance between two points (x₁, y₁) and (x₂, y₂), we use the Euclidean distance. The formula is:
D² = (x₂ − x₁)² + (y₂ − y₁)²
Taking the square root of both sides gives the value of D:
D = √((x₂ − x₁)² + (y₂ − y₁)²)
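The formula translates directly into code. As a quick sanity check, the classic 3-4-5 right triangle gives a distance of exactly 5:

```python
import math

def euclidean_distance(p1, p2):
    """D = sqrt((x2 - x1)^2 + (y2 - y1)^2)"""
    (x1, y1), (x2, y2) = p1, p2
    return math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)

print(euclidean_distance((0, 0), (3, 4)))  # → 5.0
```

Python's standard library also provides `math.dist`, which computes the same Euclidean distance for points of any dimension.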
How to select the best value for K
The whole model's performance depends on the value of K, so choosing it is very important. Here are some guidelines that can help you determine the value of K:
- First of all, there is no universal best value of K.
- The value of K should usually be odd, so that classification votes cannot end in a tie.
- K should be neither too small (sensitive to noise) nor too large (distant points dilute the neighbourhood).
- Always try several values of K (hit and trial) to be sure you get the best accuracy.
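The hit-and-trial approach above can be sketched with a simple leave-one-out check: for each candidate K, predict every point from all the other points and count how often the prediction is right. The dataset and helper names here are hypothetical, purely for illustration:

```python
from collections import Counter
import math

def knn_predict(points, labels, query, k):
    """Majority-vote KNN prediction for a single query point."""
    nearest = sorted(range(len(points)), key=lambda i: math.dist(points[i], query))[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]

def loo_accuracy(points, labels, k):
    """Leave-one-out accuracy: predict each point from all the others."""
    hits = 0
    for i in range(len(points)):
        rest_pts = points[:i] + points[i + 1:]
        rest_lbl = labels[:i] + labels[i + 1:]
        hits += knn_predict(rest_pts, rest_lbl, points[i], k) == labels[i]
    return hits / len(points)

# Hypothetical toy data: two well-separated clusters
points = [(1, 1), (1, 2), (2, 1), (2, 2), (6, 6), (6, 7), (7, 6), (7, 7)]
labels = ["yellow"] * 4 + ["green"] * 4
for k in (1, 3, 5):
    print(k, loo_accuracy(points, labels, k))
```

On real data you would keep the K with the highest score; libraries such as scikit-learn offer proper cross-validation tools for exactly this purpose.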
Conclusion
I hope you now have a good understanding of K-Nearest Neighbour (KNN). In the near future I will write more articles explaining other models, including one that implements KNN with full source code.