Random Forest

Description

  • Random Forest is a Supervised Machine Learning algorithm. It works on the Bagging (Bootstrap Aggregation) principle of the Ensemble technique. Thus, it uses multiple models instead of a single model to make predictions.
  • It is extensively used to solve Classification and Regression problems.
  • In the case of Classification, Random Forest builds multiple decision trees, trains each tree on a bootstrap sample of the data (the Bagging principle), and outputs the class that receives the majority of the trees' votes.
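
The Bagging and majority-vote steps described above can be sketched in plain Python. This is only an illustrative sketch, not the product's implementation: it assumes numeric features, uses one-split trees ("stumps") in place of full decision trees to stay short, and all function names, the toy data, and the parameters n_trees and seed are hypothetical.

```python
import random
from collections import Counter

def bootstrap_sample(rows, labels, rng):
    """Draw len(rows) examples with replacement (the 'bootstrap' in Bagging)."""
    idx = [rng.randrange(len(rows)) for _ in rows]
    return [rows[i] for i in idx], [labels[i] for i in idx]

def fit_stump(rows, labels, rng):
    """Fit a one-split tree on one randomly chosen feature.

    Assumption: a real Random Forest grows deeper trees; a stump keeps
    the sketch small while still showing random feature selection.
    """
    feature = rng.randrange(len(rows[0]))
    best = None
    for threshold in sorted({r[feature] for r in rows}):
        left = [y for r, y in zip(rows, labels) if r[feature] <= threshold]
        right = [y for r, y in zip(rows, labels) if r[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = (sum(y != left_label for y in left)
                  + sum(y != right_label for y in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return feature, threshold, left_label, right_label

def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on its own bootstrap sample."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample_rows, sample_labels = bootstrap_sample(rows, labels, rng)
        forest.append(fit_stump(sample_rows, sample_labels, rng))
    return forest

def predict_forest(forest, row):
    """Return the class with the most votes across all trees."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two numeric features, two classes.
rows = [(1.0, 1.2), (1.5, 0.8), (2.0, 1.1), (6.0, 5.5), (6.5, 6.2), (7.0, 5.9)]
labels = ["A", "A", "A", "B", "B", "B"]

forest = fit_forest(rows, labels, n_trees=25, seed=0)
print(predict_forest(forest, (0.5, 0.5)))
print(predict_forest(forest, (8.0, 8.0)))
```

Because each tree sees a different bootstrap sample and a different feature, individual trees can disagree; the vote in predict_forest is what turns them into a single prediction.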

Why to use

To predict a class label based on input data. In other words, to identify data points and separate them into categories.

When to use

When you have numerical data

When not to use

When you have textual data or data without categorical variables
Note: You can create categorical variables from text labels by using the label encoding technique, which replaces each distinct label with a numeric code.
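
As a rough illustration of the label encoding mentioned in the note, the sketch below maps each distinct text label to an integer code. The function names and the sample column are hypothetical; assigning codes in sorted label order is an assumption made here for reproducibility, not a requirement of the technique.

```python
def fit_label_encoder(values):
    """Build a label -> integer code mapping, in sorted label order."""
    classes = sorted(set(values))
    return {label: code for code, label in enumerate(classes)}

def encode(values, mapping):
    """Replace every label with its integer code."""
    return [mapping[v] for v in values]

# Hypothetical text column.
colors = ["red", "green", "blue", "green"]

mapping = fit_label_encoder(colors)   # {'blue': 0, 'green': 1, 'red': 2}
codes = encode(colors, mapping)       # [2, 1, 0, 1]
print(mapping, codes)
```

The codes carry no numeric meaning of their own (blue is not "less than" red); they only give each category a distinct value the algorithm can split on.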

Prerequisites

  • The dataset should have at least one categorical variable.

Input

Numerical data containing at least one categorical variable

Output

Labelled or classified data

Statistical Methods used

  • Accuracy
  • Sensitivity
  • Specificity
  • F-score
  • ROC chart
  • Lift Curve
  • Confusion Matrix
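
Several of the measures listed above follow directly from the four cells of a binary confusion matrix. The sketch below shows one way to compute them; the function names and the sample labels are hypothetical, the F-score is taken to be the F1 variant (an assumption), and the ROC chart and Lift Curve are left out because they need predicted probabilities rather than hard labels.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (tp, fp, tn, fn) for a binary problem."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def safe_div(num, den):
    """Avoid ZeroDivisionError when a class is absent."""
    return num / den if den else 0.0

def classification_metrics(actual, predicted, positive=1):
    tp, fp, tn, fn = confusion_counts(actual, predicted, positive)
    precision = safe_div(tp, tp + fp)
    sensitivity = safe_div(tp, tp + fn)          # true positive rate (recall)
    specificity = safe_div(tn, tn + fp)          # true negative rate
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": safe_div(2 * precision * sensitivity, precision + sensitivity),
    }

# Hypothetical labels: 1 is the positive class.
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

counts = confusion_counts(actual, predicted)     # (3, 2, 4, 1)
scores = classification_metrics(actual, predicted)
print(counts, scores)
```

For this sample the confusion matrix holds 3 true positives, 2 false positives, 4 true negatives, and 1 false negative, which gives an accuracy of 0.7 and a sensitivity of 0.75.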

Limitations

  • It is slower to train and to predict with, and harder to interpret, than a single decision tree.
  • Its prediction accuracy can be low for complex classification problems.
