| Ridge Regression | | | |
| --- | --- | --- | --- |
| Description | Predicts outcomes for multiple-regression data that suffer from multicollinearity by controlling the magnitude of the coefficients to avoid over-fitting. | | |
| Why to use | Predictive Modeling | | |
| When to use | To regularize the regression when the model over-fits, i.e., the Sum of Squared Residuals is very low on the training data but high on new data. | When not to use | On textual data. |
| Prerequisites | | | |
| Input | Any continuous data | Output | The predicted values of the dependent variable. |
| Statistical Methods used | L2 regularization (a penalty on the squared magnitude of the coefficients) | Limitations | It cannot be used on textual data. |
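As a quick illustration of the input/output contract in the table above, the sketch below fits a ridge model on synthetic continuous data and prints its predictions. It assumes scikit-learn's `Ridge` estimator and NumPy; all variable names and values are illustrative, not part of the source.

```python
# Minimal sketch, assuming scikit-learn and NumPy are available.
# The data is synthetic and purely illustrative.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                      # continuous independent variables
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=100)

model = Ridge(alpha=1.0)                           # alpha sets the strength of the L2 penalty
model.fit(X, y)
print(model.coef_)                                 # shrunken coefficients
print(model.predict(X[:5]))                        # predicted values of the dependent variable
```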
Regularization techniques are used to build simpler models from datasets that contain a large number of features. Regularization mitigates over-fitting to a great extent and can also help with feature selection.
L1 regularization (Lasso Regression) reduces the number of features by shrinking the coefficients of less important features exactly to zero. L2 regularization, also called Ridge Regression, instead introduces a penalty term proportional to the squared magnitude of the coefficients, shrinking them toward zero without eliminating any feature. This penalty trades a small increase in training error for predictions that track new observations more closely.
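Concretely, ridge minimizes the sum of squared residuals plus a penalty of lambda times the sum of squared coefficients, which yields the closed-form estimate (XᵀX + λI)⁻¹Xᵀy. The NumPy sketch below is an illustrative implementation of that formula; the function name and the default lambda are assumptions made here, not part of the source.

```python
# Illustrative closed-form ridge estimate:
#   beta_ridge = (X^T X + lam * I)^(-1) X^T y
# lam (lambda) is the penalty strength; larger values shrink the coefficients more.
import numpy as np

def ridge_coefficients(X, y, lam=1.0):
    n_features = X.shape[1]
    penalty = lam * np.eye(n_features)             # the L2 penalty term added to X^T X
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)
```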
Thus, Ridge regression mitigates the problem of multicollinearity in linear regression. Multicollinearity arises when the independent variables in a regression model are correlated with one another, which can negatively affect model fitting and the interpretation of results.
Hence, when the magnitudes of the coefficients are shrunk toward zero, the model generalizes better to new datasets and produces more reliable predictions, as the sketch below illustrates.
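To make the shrinkage effect visible, the sketch below builds two nearly collinear predictors and fits scikit-learn's `Ridge` at several penalty strengths. The data and alpha values are illustrative assumptions, but the pattern they expose (larger alpha gives smaller, more stable coefficients) is the behavior described above.

```python
# Synthetic demonstration of coefficient shrinkage under multicollinearity
# (assumes scikit-learn and NumPy; data and alpha values are illustrative).
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(42)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.01, size=200)         # nearly collinear with x1
X = np.column_stack([x1, x2])
y = 3.0 * x1 + rng.normal(scale=0.5, size=200)

for alpha in (0.01, 1.0, 100.0):
    coefs = Ridge(alpha=alpha).fit(X, y).coef_
    print(f"alpha={alpha:>6}: coefficients = {np.round(coefs, 3)}")
```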