Johnson Transformation

Description

The Johnson transformation is a statistical technique that transforms non-normal data so that its distribution is closer to normal. It extends the Box-Cox transformation and can handle both positively and negatively skewed data.

Why to use

To normalize the distribution of a continuous variable.

When to use

  • Non-normality of data

When not to use

  • Data with extreme outliers
  • Categorical or ordinal data
  • Non-linear relationships

Prerequisites

  • A dataset with a non-normal distribution.

Input

Any dataset containing numerical variables.

Output

A transformed version of the original dataset.

Statistical Methods Used

  • Maximum Likelihood Estimation (MLE)
  • Moment Estimation
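
The maximum likelihood step listed above can be illustrated with a minimal sketch. It assumes the Yeo-Johnson form of the transformation and a normal-theory profile log-likelihood searched over a simple grid. The function names (`yeo_johnson`, `log_likelihood`, `estimate_lambda`), the grid bounds, and the sample values are hypothetical and chosen for illustration only; this is not necessarily how the product estimates the parameter.

```python
import math


def yeo_johnson(y, lam):
    """Apply the Yeo-Johnson transform to one value y with parameter lam."""
    if y >= 0:
        if abs(lam) < 1e-12:          # lam == 0 branch
            return math.log1p(y)
        return ((y + 1.0) ** lam - 1.0) / lam
    if abs(lam - 2.0) < 1e-12:        # lam == 2 branch
        return -math.log1p(-y)
    return -(((1.0 - y) ** (2.0 - lam)) - 1.0) / (2.0 - lam)


def log_likelihood(data, lam):
    """Profile log-likelihood (constants dropped), assuming normal transformed data."""
    n = len(data)
    t = [yeo_johnson(y, lam) for y in data]
    mean = sum(t) / n
    var = sum((v - mean) ** 2 for v in t) / n      # MLE variance (divide by n)
    # Log-Jacobian term of the transform: (lam - 1) * sum(sign(y) * log(|y| + 1))
    jac = sum(math.copysign(1.0, y) * math.log1p(abs(y)) for y in data)
    return -0.5 * n * math.log(var) + (lam - 1.0) * jac


def estimate_lambda(data, lo=-2.0, hi=4.0, steps=601):
    """Pick the grid value of lam that maximizes the log-likelihood."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda lam: log_likelihood(data, lam))


# Hypothetical right-skewed sample that includes negative values and zero.
sample = [-1.5, -0.2, 0.0, 0.3, 0.8, 1.1, 2.4, 5.0, 9.7, 20.3]
best_lam = estimate_lambda(sample)
transformed = [yeo_johnson(y, best_lam) for y in sample]
```

A grid search is used here only to keep the sketch dependency-free; a numerical optimizer would normally replace it. Note that the data must not be constant, because the variance term would then be zero.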

Limitations

  • Data range
  • Parameter estimation
  • Sensitivity to outliers

The Johnson transformation (or Yeo-Johnson transformation) is a statistical method that converts a numerical variable so that its distribution is closer to a normal distribution. Unlike the Box-Cox transformation, which cannot be applied to zero or negative values, the Johnson transformation can handle negative values.
The following formula describes the Johnson transformation:
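
Assuming the Yeo-Johnson form named above (an assumption, since the text does not spell the formula out), the transformation of a value y with parameter λ is usually written as:

```latex
\psi(\lambda, y) =
\begin{cases}
\dfrac{(y + 1)^{\lambda} - 1}{\lambda} & \text{if } \lambda \neq 0,\ y \geq 0 \\[1ex]
\ln(y + 1) & \text{if } \lambda = 0,\ y \geq 0 \\[1ex]
-\dfrac{(1 - y)^{2 - \lambda} - 1}{2 - \lambda} & \text{if } \lambda \neq 2,\ y < 0 \\[1ex]
-\ln(1 - y) & \text{if } \lambda = 2,\ y < 0
\end{cases}
```

With λ = 1 the transformation leaves every value unchanged, so estimated values of λ away from 1 indicate how strongly the data is reshaped.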

