k Nearest Neighbor Regression
Description | k Nearest Neighbor (kNN) Regression predicts a numeric value for a new data point from the known values of the labeled points closest to it. In kNN, we start from a set of labeled points and use them to label other points. |
Why to use | To predict a continuous value for a new data point from the values of its nearest labeled neighbors. |
When to use | - When the dataset under consideration is small (has few data points).
- When data points are continuous.
| When not to use | - When your dataset contains a large number of data points.
- When data points are discrete.
|
Prerequisites | - Data points should be continuous
- The dataset should not have any missing values
|
Input | Any numerical data. | Output | Predicted value for a new data point. |
Statistical Methods used | - Euclidean Distance
- Manhattan Distance
- Minkowski Distance
| Limitations | - It can be used only on numerical data.
- Because it uses lazy learning (computation is deferred until prediction time), prediction is slower.
|
The k-nearest neighbor (kNN) algorithm is a simple, easy-to-use supervised machine learning (ML) algorithm that can be applied to both regression and classification problems. It assumes that similar things (for example, data points with similar values) lie close to each other. It quantifies this similarity by computing the distance between data points with a distance metric such as Euclidean, Manhattan, or Minkowski distance.
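The distance metrics listed under Statistical Methods used can be sketched in plain Python as follows. This is a minimal illustration, not the product's implementation; the function names are hypothetical. Manhattan and Euclidean distance are the Minkowski distance of order 1 and 2, respectively, so both are expressed through a single helper.

```python
def minkowski_distance(a, b, p):
    """Minkowski distance of order p between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    if p < 1:
        raise ValueError("p must be at least 1")
    # Sum the p-th powers of the coordinate differences, then take the p-th root.
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def manhattan_distance(a, b):
    """Sum of absolute coordinate differences (Minkowski with p = 1)."""
    return minkowski_distance(a, b, 1)


def euclidean_distance(a, b):
    """Straight-line distance (Minkowski with p = 2)."""
    return minkowski_distance(a, b, 2)
```

For the points (0, 0) and (3, 4), the Euclidean distance is 5 and the Manhattan distance is 7.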
For a new data point, the algorithm finds the k training samples closest to it. The output depends on whether the algorithm is being used for regression or classification: in regression, the mean of the k neighbors' labels is returned, while in classification, the mode of the k labels is returned.
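The regression case can be sketched as below: find the k training points nearest to the query and return the mean of their target values. This is a brute-force illustration under assumed names (`knn_regress`, `train_X`, `train_y` are hypothetical) and uses Euclidean distance; it is not the product's implementation.

```python
def knn_regress(train_X, train_y, query, k=3):
    """Predict a numeric value for `query` as the mean target of its k nearest training points."""
    if not 1 <= k <= len(train_X):
        raise ValueError("k must be between 1 and the number of training points")
    # Lazy learning: there is no training step, so every distance is
    # computed at prediction time.
    distances = [
        (sum((a - b) ** 2 for a, b in zip(x, query)) ** 0.5, y)
        for x, y in zip(train_X, train_y)
    ]
    # Sort by distance only and keep the targets of the k closest points.
    distances.sort(key=lambda pair: pair[0])
    nearest_targets = [y for _, y in distances[:k]]
    return sum(nearest_targets) / k


# Toy data, invented for illustration only.
train_X = [[1.0], [2.0], [3.0], [10.0]]
train_y = [10.0, 20.0, 30.0, 100.0]
prediction = knn_regress(train_X, train_y, [2.5], k=2)
```

With this toy data, the two nearest neighbors of 2.5 are the points at 2.0 and 3.0, so the prediction is the mean of their targets, 25.0.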
Classification is done by a majority vote of the k nearest neighbors: the new data point is assigned to the class most common among its k closest neighbors.
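For comparison with the regression case, the majority vote can be sketched as follows. The function and variable names are hypothetical, Euclidean distance is assumed, and ties between classes are not handled specially in this sketch.

```python
from collections import Counter


def knn_classify(train_X, train_labels, query, k=3):
    """Assign `query` the most common class among its k nearest training points."""
    if not 1 <= k <= len(train_X):
        raise ValueError("k must be between 1 and the number of training points")
    # Euclidean distance from the query to every training point.
    distances = [
        (sum((a - b) ** 2 for a, b in zip(x, query)) ** 0.5, label)
        for x, label in zip(train_X, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    # Majority vote: the mode of the k nearest labels.
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy data, invented for illustration only.
points = [[1, 1], [1, 2], [5, 5], [6, 5], [6, 6]]
labels = ["a", "a", "b", "b", "b"]
predicted_class = knn_classify(points, labels, [2, 2], k=3)
```

Here the three nearest neighbors of (2, 2) carry the labels a, a, and b, so the vote assigns class a.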