Decision Tree Regression

Decision Tree Regression is located under Machine Learning > Regression > Decision Tree Regression.
Drag and drop the algorithm onto the canvas (or double-click the node) to use it. Click the algorithm to view and select the properties for analysis.

Properties of Decision Tree Regression

The available properties of the Decision Tree Regression are as shown in the figure below.

The table given below describes the different fields present on the properties pane of Decision Tree Regression.

Field

Description

Remark

Run

It allows you to run the node.

Explore

It allows you to explore the successfully executed node.

Vertical Ellipsis

The available options are:

  • Run till node
  • Run from node
  • Publish as a model
  • Publish code

Task Name

It displays the name of the selected task.

You can click the text field to edit or modify the task's name as required.

Dependent Variable

It allows you to select, from the drop-down list, the dependent variable y whose values you want to predict.

  • Only one data field can be selected.
  • Only numerical data fields are visible.

Independent Variable

It allows you to select the experimental or predictor variable(s) x.

  • Multiple data fields can be selected.
  • You can set the encoding method for categorical variables.

Advanced



Criterion

It allows you to select the criterion used to measure the quality of a split.

  • It is a tree-specific parameter.
  • It decides the quality of the split.
  • The available options are:
    • squared_error (default)
    • absolute_error
    • friedman_mse
    • poisson
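To make "quality of the split" concrete, the following is a minimal sketch of how the squared_error and absolute_error criteria could score a candidate split. The function names and the target values are hypothetical and chosen for illustration only; this is not the node's actual implementation.

```python
from statistics import mean, median

def squared_error(values):
    """Mean squared deviation from the mean of the node."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def absolute_error(values):
    """Mean absolute deviation from the median of the node."""
    m = median(values)
    return sum(abs(v - m) for v in values) / len(values)

def split_impurity(left, right, impurity):
    """Sample-weighted impurity of the two child nodes of a split."""
    n = len(left) + len(right)
    return (len(left) * impurity(left) + len(right) * impurity(right)) / n

# Made-up target values, ordered by a single hypothetical predictor.
y = [10.0, 12.0, 11.0, 30.0, 32.0, 31.0]

# A split that separates the low values from the high values ...
good = split_impurity(y[:3], y[3:], squared_error)
# ... scores lower (better) than one that leaves them mixed.
poor = split_impurity(y[:1], y[1:], squared_error)

print(f"good split: {good:.3f}, poor split: {poor:.3f}")
```

A lower weighted impurity means a better split; passing absolute_error instead of squared_error changes only how the spread within each child node is measured.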

Maximum Features

It allows you to select the maximum number of features to be considered for the best split.

  • The available options are:
    • auto
    • sqrt
    • log2
    • None (default)
  • auto: It uses sqrt by default.
  • sqrt: It takes the square root of the number of independent variables as the maximum number of features.
  • log2: It takes the base-2 logarithm of the number of independent variables as the maximum number of features.
  • None: It considers all the independent variables at each split.
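The options above can be sketched as a small function that turns an option into a feature count. The function name resolve_max_features is hypothetical, and rounding down with a minimum of 1 is an assumption borrowed from how scikit-learn handles sqrt and log2; auto is treated as sqrt, following the description above.

```python
import math

def resolve_max_features(option, n_features):
    """Return how many features are considered at each split."""
    if option is None:
        return n_features                             # all independent variables
    if option in ("auto", "sqrt"):                    # auto follows sqrt, as described above
        return max(1, int(math.sqrt(n_features)))     # assumption: round down, at least 1
    if option == "log2":
        return max(1, int(math.log2(n_features)))     # assumption: round down, at least 1
    raise ValueError(f"unknown option: {option!r}")

for option in ("auto", "sqrt", "log2", None):
    print(option, resolve_max_features(option, 10))
```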

Random State

It allows you to enter the seed of the random number generator.

-

Maximum Depth

It allows you to enter the maximum depth of the decision tree.

The default value is "None".

Minimum Samples Leaf

It allows you to enter the minimum number of samples (data points) required at a leaf node of the decision tree.

The default value is 1.

Minimum Samples Split

It controls the minimum number of samples required to split an internal node (a decision tree node) into child nodes.

The default value is 2.
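The three size limits (Maximum Depth, Minimum Samples Leaf, and Minimum Samples Split) interact while the tree grows. The sketch below builds a toy regression tree on a single numeric predictor to show where each limit applies. All names and data are hypothetical, and the stopping rules are assumptions about typical decision tree behavior, not the node's actual code.

```python
from statistics import mean

def sse(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(rows, depth=0, max_depth=None,
               min_samples_split=2, min_samples_leaf=1):
    """rows is a list of (x, y) pairs with one numeric predictor x."""
    ys = [y for _, y in rows]
    leaf = {"prediction": mean(ys), "samples": len(rows)}

    # Maximum Depth: stop once the depth limit is reached.
    if max_depth is not None and depth >= max_depth:
        return leaf
    # Minimum Samples Split: a node with too few samples is not split.
    if len(rows) < min_samples_split or sse(ys) == 0:
        return leaf

    rows = sorted(rows)
    best = None
    # Minimum Samples Leaf: only try splits that leave enough samples per side.
    for i in range(min_samples_leaf, len(rows) - min_samples_leaf + 1):
        if rows[i - 1][0] == rows[i][0]:
            continue  # no threshold separates equal x values
        left, right = rows[:i], rows[i:]
        cost = sse([y for _, y in left]) + sse([y for _, y in right])
        if best is None or cost < best[0]:
            best = (cost, (rows[i - 1][0] + rows[i][0]) / 2, left, right)

    if best is None:
        return leaf

    _, threshold, left, right = best
    limits = dict(max_depth=max_depth, min_samples_split=min_samples_split,
                  min_samples_leaf=min_samples_leaf)
    return {"threshold": threshold,
            "left": build_tree(left, depth + 1, **limits),
            "right": build_tree(right, depth + 1, **limits)}

def depth_of(tree):
    if "prediction" in tree:
        return 0
    return 1 + max(depth_of(tree["left"]), depth_of(tree["right"]))

def leaf_sizes(tree):
    if "prediction" in tree:
        return [tree["samples"]]
    return leaf_sizes(tree["left"]) + leaf_sizes(tree["right"])

# Made-up (x, y) pairs for illustration.
rows = [(1, 5.0), (2, 6.0), (3, 7.0), (4, 20.0),
        (5, 21.0), (6, 23.0), (7, 40.0), (8, 41.0)]

unrestricted = build_tree(rows)                  # defaults: grows until leaves are pure
shallow = build_tree(rows, max_depth=1)          # a single split
coarse = build_tree(rows, min_samples_leaf=3)    # every leaf keeps 3 or more samples

print(leaf_sizes(unrestricted), depth_of(shallow), leaf_sizes(coarse))
```

With the defaults the toy tree keeps splitting until every leaf holds one sample, which is why tightening any of the three limits is a common way to reduce overfitting.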

Splitter

It allows you to select the strategy used to choose the split at each internal node of the tree.

The available options are:
  • Best (default): It chooses the best split at each node.
  • Random: It chooses the best split among randomly drawn candidate splits.
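The difference between the two strategies can be illustrated on a single predictor. This is a sketch under the assumption that the Splitter field behaves like the splitter parameter of scikit-learn's DecisionTreeRegressor, which this page does not confirm; the function names and data are hypothetical. The seeded generator plays the role of the Random State field.

```python
import random
from statistics import mean

def sse(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def split_cost(rows, threshold):
    """Total squared error of the two sides of a threshold split."""
    left = [y for x, y in rows if x <= threshold]
    right = [y for x, y in rows if x > threshold]
    if not left or not right:
        return float("inf")  # the threshold does not actually split the data
    return sse(left) + sse(right)

def best_threshold(rows):
    """'Best': evaluate every midpoint between distinct x values."""
    xs = sorted({x for x, _ in rows})
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    return min(candidates, key=lambda t: split_cost(rows, t))

def random_threshold(rows, rng):
    """'Random': draw one threshold uniformly within the range of x."""
    xs = [x for x, _ in rows]
    return rng.uniform(min(xs), max(xs))

# Made-up (x, y) pairs for illustration.
rows = [(1, 5.0), (2, 6.0), (3, 7.0), (4, 20.0), (5, 21.0), (6, 23.0)]

rng = random.Random(42)  # fixed seed, so the random draw is reproducible
t_best = best_threshold(rows)
t_rand = random_threshold(rows, rng)

print(t_best, split_cost(rows, t_best))
print(t_rand, split_cost(rows, t_rand))
```

The exhaustive search never does worse on the training data than a random draw, while the random strategy is cheaper and adds variation between runs unless the seed is fixed.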

Dimensionality Reduction

It allows you to select the dimensionality reduction option.
There are two options available:

  • None
  • Principal Component Analysis (PCA)

If you select PCA, another field appears for entering the variance value. Enter a suitable value between 0.0 and 1.0 for the variance.
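One common reading of the variance value is the fraction of total variance that the retained principal components must explain. The sketch below assumes that interpretation, which this page does not state explicitly; the function name and the explained-variance ratios are made up for illustration.

```python
def components_for_variance(explained_variance_ratio, variance):
    """Smallest number of leading components whose cumulative ratio reaches `variance`."""
    if not 0.0 < variance <= 1.0:
        raise ValueError("variance must be greater than 0.0 and at most 1.0")
    cumulative = 0.0
    for count, ratio in enumerate(explained_variance_ratio, start=1):
        cumulative += ratio
        if cumulative >= variance:
            return count
    return len(explained_variance_ratio)  # keep every component

# Made-up ratios for five components, sorted from largest to smallest.
ratios = [0.62, 0.21, 0.09, 0.05, 0.03]

print(components_for_variance(ratios, 0.80))
print(components_for_variance(ratios, 0.95))
```

Under this reading, a higher variance value keeps more components and discards less information from the independent variables.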

Add result as a variable

It allows you to select whether the result of the algorithm is to be added as a variable.

For more details, refer to Adding Result as a Variable.

Node Configuration

It allows you to select the instance of the AWS server to provide control over the execution of a task in a workbook or workflow.

For more details, refer to Worker Node Configuration.

Hyper Parameter Optimization

It allows you to select parameters for optimization.

For more details, refer to Hyper Parameter Optimization.

Example of Decision Tree Regression

Let's use the penguin dataset to predict a penguin's body mass based on its bill length, flipper length, species, sex, and island.
The figure below shows the workflow and the algorithm properties panel.

The result page of the decision tree regression is shown below.

The snippet below shows the predicted and actual values of the body mass (g).
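Predicted and actual values are usually compared with summary metrics. The sketch below computes three common ones (MAE, RMSE, and R²) in plain Python. The numbers are made up for illustration and are not taken from the result page, and this page does not state which metrics the result page reports.

```python
import math

def regression_metrics(actual, predicted):
    """Return MAE, RMSE, and R^2 for paired actual and predicted values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    ss_res = sum(e * e for e in errors)                    # residual sum of squares
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)   # total sum of squares
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        "r2": 1 - ss_res / ss_tot,
    }

# Hypothetical body mass values in grams.
actual = [3750.0, 3800.0, 4675.0, 5400.0]
predicted = [3700.0, 3900.0, 4600.0, 5500.0]

metrics = regression_metrics(actual, predicted)
print(metrics)
```

MAE and RMSE are expressed in the units of the dependent variable (grams here), while R² is unitless and approaches 1 as the predictions get closer to the actual values.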