Binomial Logistic Regression

Binomial Logistic Regression is located under Machine Learning in Data Classification, in the task pane on the left. Use the drag-and-drop method to place the algorithm on the canvas. Click the algorithm to view and select different properties for analysis. Refer to Properties of Binomial Logistic Regression.


Properties of Binomial Logistic Regression

The available properties of Binomial Logistic Regression are as shown in the figure given below.

The table given below describes the different fields present in the properties of Binomial Logistic Regression.

Field

Description

Remark

Run

It allows you to run the node.

-

Explore

It allows you to explore the successfully executed node.

-

Vertical Ellipses

The available options are

  • Run till node
  • Run from node
  • Publish as a model
  • Publish code
-

Task Name

It displays the name of the selected task.

You can click the text field to edit or modify the name of the task as required.

Dependent Variable

It allows you to select the binary variable.

Any one of the available binary variables can be selected.

Independent Variable

It allows you to select the continuous variable.

  • Multiple variables can be selected.
  • To select the type of data scaling for a variable, hover over the name of the variable and click the gear icon.
  • You can select any one of the data scaling techniques.

Fit Intercept

It allows you to choose whether the model fits an intercept term.

  • Fit Intercept is a Boolean parameter.
  • It has two values, True and False.
  • By default, the value is selected as True.

Method

It allows you to select the method (or solver).

  • It is the algorithm to be used in the optimization problem.
  • The available methods are newton, lbfgs, powell, cg, ncg, basinhopping, and minimize.

Maximum Iteration

It allows you to select the maximum number of iterations to be performed on the data.

  • By default, the value is selected as 100.
  • You can select any value for the number of iterations to be performed. 

Dimensionality Reduction

  • It allows you to select the dimensionality reduction option.
  • There are two options available: None and Principal Component Analysis (PCA).
  • If you select PCA, another field appears for entering the variance value.
  • Select a suitable value of variance.
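
To illustrate how the Fit Intercept and Maximum Iteration properties affect model fitting, the following minimal Python sketch fits a logistic regression by plain batch gradient descent. It is not the platform's implementation: the solvers listed under Method are not reproduced here, and the function names, the learning rate lr, and the toy data are assumptions made for illustration only.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, fit_intercept=True, max_iter=100, lr=0.5):
    """Fit weights by batch gradient descent on the mean log-loss.

    `lr` (learning rate) is a hypothetical parameter of this sketch.
    """
    # Prepend a constant 1.0 column when an intercept is requested.
    rows = [([1.0] + list(r)) if fit_intercept else list(r) for r in X]
    w = [0.0] * len(rows[0])
    n = len(rows)
    for _ in range(max_iter):
        grad = [0.0] * len(w)
        for r, target in zip(rows, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, r))) - target
            for j, xj in enumerate(r):
                grad[j] += err * xj / n
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w


def predict_proba(w, x, fit_intercept=True):
    """Probability of the positive class for one record."""
    r = ([1.0] + list(x)) if fit_intercept else list(x)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, r)))


# Toy, made-up data: one scaled independent variable and a binary target.
X = [[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]]
y = [0, 0, 0, 1, 1, 1]

w = fit_logistic(X, y, fit_intercept=True, max_iter=100)
w_no_intercept = fit_logistic(X, y, fit_intercept=False, max_iter=100)
```

With fit_intercept=False the constant term is dropped, so the returned list holds one weight per independent variable. A larger max_iter allows the optimizer to take more steps before it stops.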


Example of Binomial Logistic Regression

In the example given below, Binomial Logistic Regression is applied to a Credit Card Balance dataset. The independent variables are Income, Limit, Cards, Age, and Balance. Gender is selected as the dependent (binary) variable.

The figure given below displays the input data.


After using the Binomial Logistic Regression, the following results are displayed according to the Event of Interest, that is, either Male or Female.






Male: The Key Performance Index results obtained for the Event of Interest Male are given below.





The table given below describes the various parameters present on the Key Performance Index.

Field

Description

Remark

Sensitivity

It gives the ability of a test to correctly identify the positive results.
Sensitivity = TP / (TP + FN)
Where,
TP = number of true positives
FN = number of false negatives

  • It is also called the True Positive Rate or recall.
  • The value of sensitivity for Male is 0.285.

Specificity

  • It gives the ratio of the correctly classified negative samples to the total number of negative samples.

    Specificity = TN / (TN + FP)
    Where,
    TN = number of true negatives
    FP = number of false positives

  • It is also called inverse recall.
  • The value of specificity for Male is 0.8164.

F-score

  • F-score is a measure of the accuracy of a test.
  • It is the harmonic mean of the precision and the recall of the test.

    F-score = 2 × (precision × recall) / (precision + recall)
    Where,
    precision = positive predictive value, which is the proportion of predicted positive results that are actually positive.
    recall = sensitivity of the test, which is the proportion of actual positive results that are correctly identified (the true positive rate).
  • It is also called the F-measure or F1 score.
  • The F-score for Male is 0.3846.

Accuracy

  • Accuracy is the ratio of the total number of correct predictions made by the model to the total predictions made.

    Accuracy = (TP + TN) / (TP + TN + FP + FN)
    Where,
    TP, TN, FP, and FN indicate True Positives, True Negatives, False Positives, and False Negatives respectively.
  • The accuracy for Male is 0.56.

Precision

  • Precision is the ratio of true positives to the sum of true positives and false positives. It is the proportion of the model's positive predictions that are correct.

    Precision = TP / (TP + FP)
  • The precision for Male is 0.5914.
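
As a quick check of the F-score formula, the sketch below recomputes the F-score from the precision and sensitivity values reported above for Male. The function name f_score is illustrative and is not part of the product.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the table above for the Event of Interest Male.
precision_male = 0.5914
recall_male = 0.285  # sensitivity

f_male = f_score(precision_male, recall_male)
print(round(f_male, 4))  # 0.3846, matching the reported F-score
```

Because the harmonic mean is pulled towards the smaller of its two inputs, the low sensitivity keeps the F-score well below the precision.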

The Confusion Matrix obtained for the Event of Interest Male is given below.

The table given below describes the various values present in the Confusion Matrix.

Field

Description

Remark

Predicted

It gives the values that are predicted by the classification model.

The predicted values for Male and Female are

  • Male – 93
  • Female – 307

Actual

It gives the actual values from the result.

The actual values for Male and Female are

  • Male – 193
  • Female – 207

True Positive

It gives the number of results that are correctly predicted to be positive.

The true positive count for Male is 55.

True Negative

It gives the number of results that are correctly predicted to be negative.

The true negative count for Male is 169.

False Positive

It gives the number of results that are falsely predicted to be positive.

The false positive count for Male is 38.

False Negative

It gives the number of results that are falsely predicted to be negative.

The false negative count for Male is 138.
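
The Key Performance Index values for Male can be cross-checked against the confusion-matrix counts. The sketch below takes Male as the positive class and assumes TP = 55, FN = 138, FP = 38, and TN = 169. This is the assignment whose totals agree with the predicted counts (Male 93, Female 307) and the actual counts (Male 193, Female 207). The function name classification_metrics is illustrative only.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the KPI values from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Assumed assignment with Male as the positive class (see the text above).
tp, fn, fp, tn = 55, 138, 38, 169

# The counts must add up to the predicted and actual totals.
assert tp + fp == 93    # predicted Male
assert tn + fn == 307   # predicted Female
assert tp + fn == 193   # actual Male
assert tn + fp == 207   # actual Female

metrics = classification_metrics(tp, tn, fp, fn)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under this assignment the computed values round to the reported sensitivity (0.285), specificity (0.8164), precision (0.5914), and accuracy (0.56).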


The Receiver Operating Characteristic (ROC) Chart for the Event of Interest Male is given below.









The Lift Chart obtained for the Event of Interest Male is given below.









The table given below describes the ROC Chart and the Lift Curve.

Field

Description

Remark

ROC Chart

  • The ROC curve is a probability curve that helps measure the performance of a classification model at various threshold settings.
  • It is plotted with the True Positive Rate on the Y-axis and the False Positive Rate on the X-axis.
  • ROC curves can be used to compare candidate models and select the one that performs best.
  • The dotted line represents random choice, with a probability of 50%, an Area Under Curve (AUC) of 0.5, and a slope of 1.
  • In the above graph, the ROC curve is very close to the dotted line, which indicates that the model's performance is close to random choice.

Lift Curve

  • Lift is a measure of the effectiveness of a model.
  • It is the ratio of the percentage gain to the percentage of random expectation at a given decile level.
  • In other words, it is the ratio of the result obtained with the predictive model to that obtained without it.
  • A lift chart contains a lift curve and a baseline.
  • The curve is expected to rise as high as possible towards the top-left corner of the graph.
  • The greater the area between the lift curve and the baseline, the better the model.
  • In the above graph, the lift curve remains above the baseline for up to 30% of the records and then becomes parallel to the baseline.
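
The ROC and lift definitions above can be sketched in a few lines of Python. The example sweeps the classification threshold to obtain the ROC points, computes the AUC with the trapezoidal rule, and computes the lift in the top fraction of records ranked by predicted score. The function names and the toy labels and scores are assumptions for illustration. They are not taken from the Credit Card Balance example or from the product.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points obtained by lowering the threshold step by step."""
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        # Handle tied scores together so each distinct score gives one point.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp += ranked[j][1]
            fp += 1 - ranked[j][1]
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def lift_at(labels, scores, fraction):
    """Positive rate in the top `fraction` of ranked records over the overall rate."""
    ranked = [lab for _, lab in sorted(zip(scores, labels), key=lambda p: -p[0])]
    k = max(1, round(fraction * len(ranked)))
    top_rate = sum(ranked[:k]) / k
    base_rate = sum(ranked) / len(ranked)
    return top_rate / base_rate


# Toy, made-up data: 1 marks the event of interest.
labels = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1]

roc_auc = auc(roc_points(labels, scores))   # 0.72 for this toy data
lift_top30 = lift_at(labels, scores, 0.3)   # about 1.33
lift_all = lift_at(labels, scores, 1.0)     # 1.0, the baseline
```

An AUC of 0.5 corresponds to the dotted random-choice line, and a lift of 1.0 corresponds to the baseline of the lift chart.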

The results obtained for the Event of Interest Female are given below.

