Lasso Regression

Description

Lasso Regression is a regression method that penalizes the model coefficients. By imposing a constraint on the size of the model parameters, it selects a subset of the input variables.

Why to use

Predictive Modeling

When to use

For variables having high multicollinearity

When not to use

On textual data.

Prerequisites

  • If the data contains any missing values, use Missing Value Imputation before proceeding with Lasso Regression.
  • If the input variable is of categorical type, use Label Encoder.
  • The output variable must be a continuous data type.
  • Linearity – The relationship between the dependent and independent variables is linear.
  • Independence – The observations (and their errors) should be independent of each other.
  • Normality – The variables should be normally distributed.
  • The Dependent variable (Y) vs. Residuals plot must not follow a pattern.
  • The errors should be normally distributed.

Input

Any continuous data

Output

The predicted value of the dependent variable

Statistical Methods used

Dimensionality Reduction

Limitations

It cannot be used on textual data.


Lasso stands for Least Absolute Shrinkage and Selection Operator. It is a regression analysis method that uses shrinkage to perform both variable selection and regularization. Shrinkage can improve the prediction accuracy and the interpretability of a statistical model.
Shrinkage means that estimates are pulled towards a central point; in Lasso regression, the coefficient estimates are pulled towards zero. Lasso regression performs L1 regularization, in which a penalty proportional to the sum of the absolute values of the coefficients is added to the least-squares loss. As a result, some coefficients become exactly zero, and the corresponding variables are eliminated from the model. The heavier the penalty, the closer the remaining coefficients are to zero. This produces simpler models that are easier to analyze.
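As a sketch of the penalty described above, the Lasso estimate can be written as the minimizer of a penalized least-squares objective. The symbols here are assumptions introduced for illustration and do not come from this article: n observations, p input variables x_ij, intercept beta_0, coefficients beta_j, and a penalty strength lambda >= 0. The 1/(2n) scaling is one common convention; other scalings only rescale lambda.

```latex
\hat{\beta} = \arg\min_{\beta_0,\,\beta}\;
  \frac{1}{2n} \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\,\beta_j \Bigr)^{2}
  \;+\; \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
```

With lambda = 0 this reduces to ordinary least squares; larger values of lambda set more coefficients to exactly zero.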
Thus, Lasso regression encourages simple, sparse models, that is, models with fewer parameters. It is well suited to data with high multicollinearity and to automating variable selection or parameter elimination during model selection.
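The way an L1 penalty sets some coefficients to exactly zero can be illustrated with a minimal, dependency-free sketch. It is not the implementation used by any particular product or library: the function names, the toy data, and the choice of cyclic coordinate descent with soft-thresholding are all assumptions made for illustration. The sketch minimizes (1/(2n)) * sum of squared residuals + lam * sum of |w_j| and assumes that the inputs and the target are already centred, so no intercept is fitted.

```python
def soft_threshold(rho, lam):
    """Pull rho towards zero by lam; values inside [-lam, lam] become exactly 0.0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Hypothetical helper: Lasso fit by cyclic coordinate descent.

    X is a list of rows, y a list of targets, lam the penalty strength.
    Assumes centred columns and a centred target (no intercept).
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # average correlation of feature j with the partial residual
            z = 0.0    # average squared value of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            # Closed-form minimizer of the objective along coordinate j.
            w[j] = soft_threshold(rho, lam) / z if z > 0 else 0.0
    return w


# Toy data (made up for this sketch): y depends strongly on the first
# column and only weakly on the second.
X = [[-2.0, 1.0], [-1.0, -2.0], [0.0, 2.0], [1.0, -2.0], [2.0, 1.0]]
y = [-5.9, -3.2, 0.2, 2.8, 6.1]

unpenalized = lasso_coordinate_descent(X, y, lam=0.0)  # plain least squares
penalized = lasso_coordinate_descent(X, y, lam=0.5)    # L1 penalty applied
print("lam = 0.0:", unpenalized)
print("lam = 0.5:", penalized)
```

On this toy data, the penalized fit shrinks the first coefficient and sets the weakly related second coefficient to exactly zero, which is the variable-selection behaviour described above. Real data usually needs standardized inputs and a penalty strength chosen by validation.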