Chi Square Test

The Chi-Square test is located under Model Studio > Statistical analysis > Hypothesis Test > Non Parametric Test.
Alternatively, use the search bar to find the Chi-Square test feature. Drag and drop the algorithm onto the canvas, or double-click it, to add it. Click the algorithm on the canvas to view and select its properties for analysis.


Properties of Chi Square Test

The available properties of the Chi Square test are shown below.


The table below describes the different properties of the Chi Square Test.

Field

Description

Remark

Run

It allows you to run the node.

Explore

It allows you to explore the successfully executed node.

Vertical Ellipses

The available options are

  • Run till node
  • Run from node
  • Publish as a model
  • Publish code
-

Task Name

It is the name of the task selected on the workbook canvas.

  • Click the text field to edit the task name.
  • Spaces are not allowed in the Task Name.

Response

It allows you to select one categorical variable from the dataset.

  • Only categorical columns are displayed in this drop-down.
  • Select one column from the drop-down.
  • Make sure the selected categorical variable has exactly two categories.

Independent Variable

It allows you to select one column from the dataset as the independent variable.

  • All data columns are displayed in this drop-down.
  • Select one column from the drop-down.

Level of Significance

It allows you to set the level of significance.
  • The default value is 0.05. You are allowed to modify this value.
  • The Alpha value must be strictly between 0 and 1. It cannot be 0 or 1.

Add result as a variable

It allows you to store the result as a variable for later use.

For more details, refer to Adding Result as a Variable.

Node Configuration

It allows you to select the instance of the AWS server to provide control over the execution of a task in a workbook or workflow.

For more details, refer to Worker Node Configuration.

Example of Chi Square Test

Consider a company's employee data containing department, gender, age, and other personal details. As an HR manager, you want to find out whether the gender distribution is the same in every department.

An input data snippet is displayed below.


We apply the Chi-Square test to the input data by selecting two columns: one as the Response and one as the Independent Variable. The chosen values are given below.

Property

Value

Task Name

Chi_Square_Test

Response

Gender

Independent Variable

Department

Alpha

0.05

The result page consists of the following sections.

Frequency Table


The frequency table displays the Response variable values in the rows and the Independent Variable values in the columns.
Observed Frequency and Expected Frequency are calculated for each combination of values. Observed Frequency is the number of occurrences found in the sample. The Expected Frequency is calculated as
Expected Frequency = ((Row Total) * (Column Total)) / Total Number of Observations
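As an illustration of the formula above, the following minimal sketch computes the expected frequencies for a small table of observed counts. The function name `expected_frequencies` and the counts are hypothetical and are not taken from the product or from the example dataset.

```python
def expected_frequencies(observed):
    """Return the expected-frequency table for a table of observed counts.

    `observed` is a list of rows: one row per Response category,
    one column per Independent Variable category.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Expected Frequency = (Row Total * Column Total) / Total Number of Observations
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical counts: rows = Gender (two categories), columns = three departments.
observed = [[20, 30, 50],
            [30, 30, 40]]
expected = expected_frequencies(observed)
for row in expected:
    print(row)
```

Each expected value is the count you would see in that cell if the Response and the Independent Variable were unrelated, given the same row and column totals.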
Computation Table


The Computation Table displays the values used to calculate the test statistic.
  • Independent Variable: Department in this example
  • Response: Gender in this example
  • Observed Frequency (O): the number of occurrences of each gender in each department
  • Expected Frequency (E): calculated as ((Row Total) * (Column Total)) / Total Number of Observations
  • Observed Frequency minus Expected Frequency (O - E)
  • (O - E)^2
  • (O - E)^2 / E
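The columns listed above can be reproduced with a short sketch. It assumes the standard Pearson chi-square statistic, that is, the sum of the (O - E)^2 / E column, with (rows - 1) * (columns - 1) degrees of freedom. The function name `computation_table` and the counts are hypothetical.

```python
def computation_table(observed):
    """Build one (O, E, O - E, (O - E)^2, (O - E)^2 / E) tuple per cell.

    Returns the list of tuples, the chi-square statistic (sum of the
    last column), and the degrees of freedom.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    rows = []
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total
            diff = o - e
            rows.append((o, e, diff, diff ** 2, diff ** 2 / e))

    statistic = sum(r[4] for r in rows)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return rows, statistic, df


# Hypothetical counts: rows = Gender (two categories), columns = three departments.
observed = [[20, 30, 50],
            [30, 30, 40]]
rows, statistic, df = computation_table(observed)
print(round(statistic, 4), df)
```

The calculated value shown on the result page corresponds to `statistic` in this sketch, provided the product uses the same uncorrected Pearson formula.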

Hypothesis Interpretation


It displays the following:
  • Null Hypothesis
  • Alternative Hypothesis
  • Result Value: the Alpha value, the p value, the critical value read from the Chi-square table, and the calculated value for the sample
  • Interpretation: the p value is compared with the Alpha value. In this example, the p value is equal to the Alpha value; hence the null hypothesis is rejected.
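The comparison of the p value with Alpha can be sketched using only the Python standard library. The sketch assumes the p value is the upper-tail probability of the chi-square distribution, and that the null hypothesis is rejected when the p value is less than or equal to Alpha, which matches the interpretation above. The function names and input values are hypothetical; the product's internal computation may differ.

```python
import math


def chi_square_p_value(statistic, df):
    """Upper-tail probability P(X >= statistic) for a chi-square
    variable with `df` degrees of freedom (df is a positive integer)."""
    if df < 1 or statistic < 0:
        raise ValueError("df must be >= 1 and statistic must be >= 0")
    z = statistic / 2.0
    # Closed forms for df = 1 and df = 2, then the recurrence
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(z))
    else:
        a, q = 1.0, math.exp(-z)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def reject_null(p_value, alpha=0.05):
    """Assumed decision rule: reject the null hypothesis when p <= alpha."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    return p_value <= alpha


# Hypothetical calculated value and degrees of freedom.
p_value = chi_square_p_value(28 / 9, 2)
decision = reject_null(p_value, alpha=0.05)
print(round(p_value, 4), decision)
```

With these hypothetical inputs the p value is well above 0.05, so the null hypothesis would not be rejected. A larger calculated value lowers the p value and moves the decision toward rejection.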


