Data Compare | ||||
Description | The Data Compare task finds and highlights differences between two datasets. It matches rows on a key column, compares numeric values, identifies changes or mismatches, and helps verify data consistency across datasets. | |||
Why to use | For Data Preparation | |||
When to use | When you want to compare two or more datasets. | When not to use | — | |
Prerequisites | The datasets must share at least one common column to use as the key column and at least one common numeric column to use as the column to compare. | |||
Related Algorithms | Data Compare | Alternative Algorithms | — | |
Input | Two or more datasets | Output | Single dataset with a new column that flags each row as "M" (match) when the difference is zero or "D" (difference) when the difference is non-zero. The output also includes the numeric difference and the percentage difference. | |
Statistical Methods used | — | Limitations | — |
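
The comparison described in the table can be illustrated with a minimal Python sketch. This is not the task's actual implementation. The function name `compare_datasets`, the output field names, the sign of the difference (second dataset minus first), the base used for the percentage (the first dataset's value), and the handling of keys missing from the second dataset are all assumptions made for illustration.

```python
def compare_datasets(left, right, key, column):
    """Compare one numeric column of two datasets, matching rows on a key column.

    Each dataset is a list of dicts. All names and conventions here are
    hypothetical; the real task may behave differently.
    """
    right_by_key = {row[key]: row for row in right}
    result = []
    for row in left:
        other = right_by_key.get(row[key])
        if other is None:
            # Assumption: rows whose key is absent from the second dataset are skipped.
            continue
        base = row[column]
        diff = other[column] - base  # assumption: second dataset minus first
        # Assumption: percentage is relative to the first dataset's value;
        # it is left undefined (None) when that value is zero.
        pct = diff * 100 / base if base != 0 else None
        result.append({
            key: row[key],
            "difference": diff,
            "pct_difference": pct,
            "flag": "M" if diff == 0 else "D",  # M = match, D = difference
        })
    return result


# Two small example datasets sharing the key column "id" and the numeric column "sales".
jan = [{"id": 1, "sales": 100}, {"id": 2, "sales": 250}]
feb = [{"id": 1, "sales": 100}, {"id": 2, "sales": 200}]

rows = compare_datasets(jan, feb, key="id", column="sales")
for r in rows:
    print(r)
```

In this example the row with `id` 1 is flagged "M" because both values are 100, and the row with `id` 2 is flagged "D" with a difference of -50.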