---
tags: Optimization Algorithms,K-NN
---
# k-Nearest Neighbors algorithm (K-NN)
- It is a supervised learning algorithm.
- K-NN is used to solve classification problems.
- K-NN takes a set of labelled data points and uses them to learn how to label other points.
- Cases that are near each other are said to be “neighbours”.
- K-NN falls under the lazy learner category: it builds no model up front and defers the computation to prediction time.
## K-nearest Neighbors algorithm steps:
1. Pick a value of k.
2. Calculate the distance from the unknown case to every known case using the Euclidean distance:
$$Euclidean\ distance = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$
3. Select the k nearest neighbours based on the calculated distances.
4. Predict the class of the unknown data point as the most voted class among its nearest neighbours.
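The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `knn_predict`, its parameters, and the toy points are all made up for this example.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point.
    distances = [math.dist(point, query) for point in train_points]
    # Indices of the k training points with the smallest distances.
    nearest = sorted(range(len(train_points)), key=lambda i: distances[i])[:k]
    # Count the labels of those neighbours and return the most common one.
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two small clusters labelled "a" and "b".
points = [(0, 0), (0, 1), (5, 5), (6, 5)]
labels = ["a", "a", "b", "b"]
print(knn_predict(points, labels, (1, 0), k=3))  # a
```

If two classes receive the same number of votes, `Counter.most_common` returns the one it counted first, so choosing an odd k for two-class problems avoids ties.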
## Advantages:
- Easy to implement
- More effective if training data is large
## Disadvantages:
- Selecting the value of k is challenging
- Computation cost is high
- Sensitive to the scale of the data
## Application
- Pattern recognition
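K-NN's sensitivity to scale comes from the distance computation: a feature with a large numeric range can dominate the Euclidean distance. The sketch below shows this on made-up records. Min-max scaling is only one common remedy and is an assumption here, and the helper names `min_max_scale` and `nearest_index` are hypothetical.

```python
import math

def min_max_scale(rows):
    """Rescale every column of `rows` to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]

def nearest_index(points, query):
    """Index of the point in `points` closest to `query`."""
    return min(range(len(points)), key=lambda i: math.dist(points[i], query))

# Hypothetical records: (age in years, yearly income).
records = [(25, 50_000), (60, 51_000), (27, 54_000), (40, 90_000)]
query, others = records[0], records[1:]

# On raw values the income column dominates the distance.
raw_nearest = others[nearest_index(others, query)]

# After scaling, both columns contribute on the same footing.
scaled = min_max_scale(records)
scaled_nearest = others[nearest_index(scaled[1:], scaled[0])]

print(raw_nearest)     # (60, 51000)
print(scaled_nearest)  # (27, 54000)
```

On the raw values the 60-year-old counts as the nearest neighbour of the 25-year-old because their incomes are close. After scaling, the 27-year-old is nearest.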
## Example
Let's assume this is the sample data for analysis:

| $x_1$ | $x_2$ | classification |
| -------- | -------- | -------- |
| 7 | 7 | bad |
| 7 | 4 | bad |
| 3 | 4 | good |
| 1 | 4 | good |

Let's predict the class of a new data point $x_1 = 3$, $x_2 = 7$.

**Assume** k=3

Using the Euclidean formula, the distance from a sample $(x_1, x_2)$ to the new point $(3, 7)$ is
$$d = \sqrt{(x_1 - 3)^2 + (x_2 - 7)^2}$$
The squared distance from each sample to the new point $(3, 7)$ is:

| $x_1$ | $x_2$ | squared distance |
| -------- | -------- | -------- |
| 7 | 7 | $(7-3)^2+(7-7)^2 = 16$ |
| 7 | 4 | $(7-3)^2+(4-7)^2 = 25$ |
| 3 | 4 | $(3-3)^2+(4-7)^2 = 9$ |
| 1 | 4 | $(1-3)^2+(4-7)^2 = 13$ |

Now rank the samples by the calculated value, smallest first, and mark the k=3 nearest. Ranking by the squared distance gives the same order as ranking by the distance itself, so the square root can be skipped.

| $x_1$ | $x_2$ | squared distance | Rank | Rank ≤ 3 |
| -------- | -------- | -------- |------|--- |
| 7 | 7 | $(7-3)^2+(7-7)^2 = 16$ | 3 | yes |
| 7 | 4 | $(7-3)^2+(4-7)^2 = 25$ | 4 | no |
| 3 | 4 | $(3-3)^2+(4-7)^2 = 9$ | 1 | yes |
| 1 | 4 | $(1-3)^2+(4-7)^2 = 13$ | 2 | yes |

Now select the k=3 nearest observations and take their values from the classification column:

| $x_1$ | $x_2$ | squared distance | Rank | Rank ≤ 3 | classification of nearest neighbour |
| -------- | -------- | -------- |------|--- | ------- |
| 7 | 7 | $(7-3)^2+(7-7)^2 = 16$ | 3 | yes | Bad |
| 7 | 4 | $(7-3)^2+(4-7)^2 = 25$ | 4 | no | -- |
| 3 | 4 | $(3-3)^2+(4-7)^2 = 9$ | 1 | yes | Good |
| 1 | 4 | $(1-3)^2+(4-7)^2 = 13$ | 2 | yes | Good |

Now, by simple majority vote:
Good --> 2 and Bad --> 1, so we can conclude that the new data point $x_1 = 3$, $x_2 = 7$ belongs to the Good category.
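The worked example can be checked with a short Python script. This is a sketch that uses the four samples from the table; the variable names and the helper `squared_distance` are made up for illustration.

```python
from collections import Counter

# The four labelled samples from the example, as ((x_1, x_2), classification).
samples = [((7, 7), "bad"), ((7, 4), "bad"), ((3, 4), "good"), ((1, 4), "good")]
query = (3, 7)
k = 3

def squared_distance(p, q):
    """Sum of squared coordinate differences between points p and q."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

# Rank the samples from nearest to farthest.
ranked = sorted(samples, key=lambda s: squared_distance(s[0], query))
for rank, (point, label) in enumerate(ranked, start=1):
    print(rank, point, squared_distance(point, query), label)

# Majority vote among the k nearest samples.
votes = Counter(label for _, label in ranked[:k])
prediction = votes.most_common(1)[0][0]
print(prediction)  # good
```

The script ranks by squared distance because taking the square root does not change the order of the neighbours.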