I am struggling to find a multivariable generalization for the KNN regression algorithm.


KNN regression is a non-parametric algorithm (meaning it assumes no underlying distribution in the data) that is used for estimating continuous variables.

As input you have a multivariate feature space: a set of feature vectors, each of which has a known label (for regression, a continuous target value). This means it is a supervised learning algorithm.

A common form of KNN regression predicts with a weighted average of the $k$ nearest neighbours, weighted by the inverse of their distance. The most commonly used distance is the Euclidean distance, given by

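$$d(\mathbf{x}, \mathbf{x}_{0}) = \sqrt{\sum_{j=1}^{p} (x_{j}-x_{0,j})^2}$$

where $p$ is the number of features, $\mathbf{x}$ is a labeled example and $\mathbf{x}_{0}$ is the query point. With $p = 2$ this reduces to the familiar $d = \sqrt{(x-x_{0})^2+(y-y_{0})^2}$. Sometimes the Mahalanobis distance is preferred instead.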
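
For reference, the Mahalanobis distance between two feature vectors $\mathbf{x}$ and $\mathbf{x}_{0}$ can be written as follows, where $S$ is assumed to be the covariance matrix of the features estimated from the labeled data (the symbol $S$ is introduced here for illustration):

$$d_{M}(\mathbf{x}, \mathbf{x}_{0}) = \sqrt{(\mathbf{x}-\mathbf{x}_{0})^{\top} S^{-1} (\mathbf{x}-\mathbf{x}_{0})}$$

When $S$ is the identity matrix this is exactly the Euclidean distance, so the Mahalanobis distance can be read as a Euclidean distance that accounts for differences in scale and for correlation between features.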

The algorithm generally works in the following way:

1. Compute the Euclidean or Mahalanobis distance from the query example to each of the labeled examples.
2. Order the labeled examples by increasing distance (ranking).
3. Find a heuristically optimal number $k$ of nearest neighbours, based on the Root Mean Square Error (RMSE). This is achieved through cross-validation.
4. Calculate an inverse-distance-weighted average of the target values of the $k$ nearest multivariate neighbours.
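
The steps above can be sketched in plain Python as follows. This is only a minimal illustration under a few assumptions, not a reference implementation: the function names (`knn_predict`, `loo_rmse`, `choose_k`) are hypothetical, the distance is Euclidean, the cross-validation is taken to be leave-one-out, and a query that coincides exactly with a labeled example simply returns that example's value to avoid dividing by zero. Because `math.dist` accepts points of any dimension, the same code works for any number of features.

```python
import math

def knn_predict(X, y, query, k, eps=1e-12):
    """Predict a continuous value as the inverse-distance-weighted
    average of the k nearest labeled examples (Euclidean distance)."""
    # Steps 1-2: distance from the query to every example, ranked ascending.
    ranked = sorted(
        ((math.dist(xi, query), yi) for xi, yi in zip(X, y)),
        key=lambda pair: pair[0],
    )
    nearest = ranked[:k]
    # Assumption: an exact match would get infinite weight, so return
    # the mean target of the exact matches instead.
    exact = [yi for d, yi in nearest if d <= eps]
    if exact:
        return sum(exact) / len(exact)
    # Step 4: weights are 1/d, normalised by their sum.
    weights = [1.0 / d for d, _ in nearest]
    weighted_sum = sum(w * yi for w, (_, yi) in zip(weights, nearest))
    return weighted_sum / sum(weights)

def loo_rmse(X, y, k):
    """Leave-one-out RMSE for a given k (X and y are Python lists)."""
    squared_error = 0.0
    for i in range(len(X)):
        # Hold out example i and predict it from all the others.
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        pred = knn_predict(X_train, y_train, X[i], k)
        squared_error += (pred - y[i]) ** 2
    return math.sqrt(squared_error / len(X))

def choose_k(X, y, candidates):
    """Step 3: pick the candidate k with the lowest leave-one-out RMSE."""
    return min(candidates, key=lambda k: loo_rmse(X, y, k))

# Toy data with two features per example (made up for illustration).
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [3.0, 3.0]]
y = [1.0, 2.0, 3.0, 10.0]
k = choose_k(X, y, candidates=[1, 2, 3])
print(k, knn_predict(X, y, [0.5, 0.0], k))
```

In practice $k$ is usually chosen once on the labeled data, before any new query is predicted, so step 3 runs ahead of steps 1, 2 and 4 rather than between them. It is also common to standardise the features first, since the Euclidean distance is otherwise dominated by whichever feature has the largest scale.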
by Diamond (49,546 points)