Binomial Logistic Regression
Binomial Logistic Regression is located under Machine Learning in Data Classification, in the task pane on the left. Use the drag-and-drop method to place the algorithm on the canvas. Click the algorithm to view and select different properties for analysis. Refer to Properties of Binomial Logistic Regression.
Properties of Binomial Logistic Regression
The available properties of Binomial Logistic Regression are as shown in the figure given below.
The table given below describes the different fields present in the properties of Binomial Logistic Regression.
Field | Description | Remark |
--- | --- | --- |
Run | It allows you to run the node. | - |
Explore | It allows you to explore the successfully executed node. | - |
Vertical Ellipsis | The available options are:
- Run till node
- Run from node
- Publish as a model
- Publish code
| - |
Task Name | It displays the name of the selected task. | You can click the text field to edit or modify the name of the task as required. |
Dependent Variable | It allows you to select the binary variable. | Any one of the available binary variables can be selected. |
Independent Variable | It allows you to select one or more continuous variables. | - Multiple variables can be selected.
- To select the type of data scaling for a variable, hover over the name of the variable and click the gear icon.
- You can select any one of the data scaling techniques.
|
Fit Intercept | It allows you to select whether an intercept term is fitted. | - Fit Intercept is a Boolean parameter.
- It has two values, True and False.
- By default, the value is selected as True.
|
Method | It allows you to select the method (or solver). | - It is the algorithm to be used in the optimization problem.
- The available methods are
- newton
- lbfgs
- powell
- cg
- ncg
- basinhopping
- minimize
|
Maximum Iteration | It allows you to select the maximum number of iterations to be performed on the data. | - By default, the value is selected as 100.
- You can select any value for the number of iterations to be performed.
|
Dimensionality Reduction | - It allows you to select the dimensionality reduction option.
- There are two options available:
- None
- Principal Component Analysis (PCA)
| - If you select PCA, another field is created for inserting the variance value.
- Select a suitable value of variance.
|
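The effect of the Fit Intercept and Maximum Iteration properties can be illustrated with a minimal pure-Python sketch. This is not the tool's implementation: it uses plain batch gradient ascent rather than any of the solvers listed above, and the names `fit_logistic`, `predict_proba`, the learning rate `lr`, and the toy data are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, fit_intercept=True, max_iter=100, lr=0.1):
    """Fit a binary logistic regression by batch gradient ascent.

    fit_intercept: if True, a constant column of 1s is prepended to X.
    max_iter: number of gradient steps (hypothetical analogue of
              the Maximum Iteration property).
    """
    rows = [([1.0] + list(r)) if fit_intercept else list(r) for r in X]
    w = [0.0] * len(rows[0])
    n = len(rows)
    for _ in range(max_iter):
        grad = [0.0] * len(w)
        for r, target in zip(rows, y):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, r)))
            for j, xj in enumerate(r):
                # Gradient of the log-likelihood with respect to w[j].
                grad[j] += (target - p) * xj
        w = [wi + lr * g / n for wi, g in zip(w, grad)]
    return w


def predict_proba(w, row, fit_intercept=True):
    """Return the predicted probability of the event of interest."""
    r = ([1.0] + list(row)) if fit_intercept else list(row)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, r)))


# Toy data (hypothetical): one continuous predictor, binary target.
X = [[-2.0], [-1.5], [-1.0], [-0.5], [0.5], [1.0], [1.5], [2.0]]
y = [0, 0, 0, 1, 0, 1, 1, 1]

weights = fit_logistic(X, y, fit_intercept=True, max_iter=100)
print("intercept and slope:", weights)
print("P(event | x = 2.0):", predict_proba(weights, [2.0]))
```

With `fit_intercept=False` the returned list contains one coefficient per independent variable only, and the decision boundary is forced through the origin.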
Example of Binomial Logistic Regression
In the example given below, Binomial Logistic Regression is applied to a Credit Card Balance dataset. The independent variables are Income, Limit, Cards, Age, and Balance. Gender is selected as the dependent (binary) variable.
The figure given below displays the input data.
After using the Binomial Logistic Regression, the following results are displayed according to the Event of Interest, that is, either Male or Female.
Male: The Key Performance Index results obtained for the Event of Interest Male are given below.
The table given below describes the various parameters present on the Key Performance Index.
Field | Description | Remark |
--- | --- | --- |
Sensitivity | It gives the ability of a test to correctly identify the positive results. Sensitivity = TP / (TP + FN), where TP = number of true positives and FN = number of false negatives. | - It is also called the True Positive Rate.
- The value of sensitivity for Male is 0.285.
|
Specificity | - It gives the ratio of the correctly classified negative samples to the total number of negative samples.
Specificity = TN / (TN + FP), where TN = number of true negatives and FP = number of false positives.
| - It is also called the True Negative Rate or inverse recall.
- The value of specificity for Male is 0.8164.
|
F-score | - F-score is a measure of the accuracy of a test.
- It is the harmonic mean of the precision and the recall of the test.
F-score = 2 × (precision × recall) / (precision + recall), where precision is the positive predictive value, that is, the proportion of predicted positives that are actually positive, and recall is the sensitivity of the test, that is, the proportion of actual positives that are correctly identified (the true positive rate).
| - It is also called the F-measure or F1 score.
- The F-score for Male is 0.3846.
|
Accuracy | - Accuracy is the ratio of the total number of correct predictions made by the model to the total predictions made.
Accuracy = (TP + TN) / (TP + TN + FP + FN) Where, TP, TN, FP, and FN indicate True Positives, True Negatives, False Positives, and False Negatives respectively.
| - The accuracy for Male is 0.56.
|
Precision | - Precision is the ratio of true positives to the sum of true positives and false positives. It is the proportion of the model's positive predictions that are correct.
Precision = TP / (TP + FP)
| - The precision for Male is 0.5914.
|
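The formulas in the table above can be checked with a short Python sketch. The function name `kpis` is hypothetical, and the four counts are an assumption: they are the confusion-matrix counts for the Event of Interest Male that reproduce the reported values (sensitivity 0.285, specificity 0.8164, F-score 0.3846, accuracy 0.56, precision 0.5914).

```python
def kpis(tp, tn, fp, fn):
    """Compute the Key Performance Index values from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # true positive rate (recall)
    specificity = tn / (tn + fp)            # true negative rate
    precision = tp / (tp + fp)              # positive predictive value
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Harmonic mean of precision and recall.
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": f_score,
        "accuracy": accuracy,
        "precision": precision,
    }


# Assumed counts for Male, chosen because they reproduce the reported KPIs.
male = kpis(tp=55, tn=169, fp=38, fn=138)
for name, value in male.items():
    print(f"{name}: {value:.4f}")
```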
The Confusion Matrix obtained for the Event of Interest Male is given below.
The table given below describes the various values present in the Confusion Matrix.
Field | Description | Remark |
--- | --- | --- |
Predicted | It gives the values that are predicted by the classification model. | The predicted values for Male and Female are |
Actual | It gives the actual values from the result. | The actual values for Male and Female are |
True Positive | It gives the number of results that are correctly predicted as positive. | The true positive count for Male is 55. |
True Negative | It gives the number of results that are correctly predicted as negative. | The true negative count for Male is 169. |
False Positive | It gives the number of results that are incorrectly predicted as positive. | The false positive count for Male is 38. |
False Negative | It gives the number of results that are incorrectly predicted as negative. | The false negative count for Male is 138. |
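How the four counts depend on the chosen Event of Interest can be sketched in a few lines of Python. The function name `confusion_counts` and the six-record toy data are hypothetical and are not taken from the Credit Card Balance dataset.

```python
def confusion_counts(actual, predicted, event):
    """Count TP, TN, FP, FN, treating `event` as the positive class."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == event and a == event:
            tp += 1          # predicted positive, actually positive
        elif p == event:
            fp += 1          # predicted positive, actually negative
        elif a == event:
            fn += 1          # predicted negative, actually positive
        else:
            tn += 1          # predicted negative, actually negative
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


# Hypothetical labels for six records.
actual = ["Male", "Male", "Female", "Female", "Male", "Female"]
predicted = ["Male", "Female", "Female", "Male", "Male", "Female"]

male_counts = confusion_counts(actual, predicted, event="Male")
female_counts = confusion_counts(actual, predicted, event="Female")
print("Male:  ", male_counts)
print("Female:", female_counts)
```

In a two-class problem, switching the Event of Interest swaps the roles of the classes: the true positives for one class are the true negatives for the other, and false positives become false negatives.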
The Receiver Operating Characteristic (ROC) Chart for the Event of Interest Male is given below.
The Lift Chart obtained for the Event of Interest Male is given below.
The table given below describes the ROC Chart and the Lift Curve.
Field | Description | Remark |
--- | --- | --- |
ROC Chart | - The ROC curve is a probability curve that helps measure the performance of a classification model at various threshold settings.
| - The ROC curve is plotted with the True Positive Rate on the Y-axis and the False Positive Rate on the X-axis.
- ROC curves can be used to compare models and select the best-performing one for a given class distribution.
- The dotted line represents random choice, with probability equal to 50%, Area Under Curve (AUC) equal to 0.5, and slope equal to 1.
- In the above graph, the ROC curve is very close to the dotted line, which suggests that the model performs only slightly better than random choice.
|
Lift Curve | - Lift is a measure of the effectiveness of a model.
- It is the ratio of the percentage gain to the percentage of random expectation at a given decile level.
- It is the ratio of the result obtained with a predictive model to that obtained without it.
| - A lift chart contains a lift curve and a baseline.
- It is expected that the curve should go as high as possible towards the top-left corner of the graph.
- The greater the area between the lift curve and the baseline, the better the model.
- In the above graph, the lift curve remains above the baseline up to 30% of the records and then becomes parallel to the baseline.
|
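The ROC and lift definitions above can be sketched in plain Python. The function names `roc_points`, `auc`, and `lift_at`, as well as the ten scored records, are hypothetical; lift is computed here as the positive rate among the top-scored fraction of records divided by the overall positive rate, which is one common way to express the ratio described in the table.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points obtained by lowering the threshold step by step."""
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Process all records that share the same score together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


def lift_at(labels, scores, fraction):
    """Positive rate in the top `fraction` of records over the overall rate."""
    ranked = [lab for _, lab in sorted(zip(scores, labels), key=lambda t: -t[0])]
    k = max(1, round(fraction * len(ranked)))
    top_rate = sum(ranked[:k]) / k
    base_rate = sum(ranked) / len(ranked)
    return top_rate / base_rate


# Hypothetical records: 1 = event of interest, score = predicted probability.
labels = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1]

area = auc(roc_points(labels, scores))
print("AUC:", area)
print("Lift at top 30%:", lift_at(labels, scores, 0.3))
print("Lift at 100%:", lift_at(labels, scores, 1.0))
```

An AUC near 0.5 corresponds to the dotted random-choice line, and a lift of 1 corresponds to the baseline of the lift chart.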
The results obtained for the binary variable Female are given below.