Regular Expression

Operators in Expression Builder

The Expression Builder on the Feature Definition page contains the following elements:

  • Features
  • Variables
  • Functions


Regular Expression Operators

The Regular Expression operator is explained in the table below.

Code Editor: re.sub(0,0,0, flags=re.IGNORECASE)

Syntax/Description: Replace(String1, String2, Feature) (Boolean Value)

  • String1 = string to be replaced
  • String2 = string that replaces String1
  • Feature = categorical column on which the replace function is applied
  • Boolean Value = optional flag (flags=re.IGNORECASE in the Code Editor) that controls case sensitivity
  • It replaces one string with another.

Example/Remark:

  • By default, this operation is case-sensitive.
  • If Boolean Value is True (the re.IGNORECASE flag is set), case sensitivity is ignored.
  • Thus, String1 is replaced even if its casing differs from the casing in the categorical column.
  • For example, String1 is 'Bitter' and String2 is 'Sweet'. Then both 'Bitter' and 'bitter' in the column are replaced by 'Sweet'.
  • If the flag is omitted, the operation stays case-sensitive.
  • Thus, String1 is replaced only if its casing is identical to the casing in the categorical column.
  • For example, String1 is 'Bitter' and String2 is 'Sweet'. Then only the value 'Bitter' is replaced by 'Sweet'. Any other string, such as 'bitter', remains unchanged.
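
The case-sensitivity behavior described above can be sketched in plain Python with the standard re module. This is only an illustration, not the product's implementation: the values list is made-up sample data standing in for a categorical column, and the placeholders in re.sub(0,0,0, ...) are assumed to map to the pattern, the replacement, and the input value in that order.

```python
import re

# Hypothetical sample values standing in for a categorical column.
values = ["Bitter", "bitter", "Sour"]

# No flag: the match is case-sensitive, so only 'Bitter' is replaced.
case_sensitive = [re.sub("Bitter", "Sweet", v) for v in values]

# flags=re.IGNORECASE: casing is ignored, so 'Bitter' and 'bitter' are both replaced.
case_insensitive = [
    re.sub("Bitter", "Sweet", v, flags=re.IGNORECASE) for v in values
]

print(case_sensitive)    # ['Sweet', 'bitter', 'Sour']
print(case_insensitive)  # ['Sweet', 'Sweet', 'Sour']
```

Because String1 is passed to re.sub as a pattern, characters such as '.' or '(' have their regular-expression meaning. Wrapping String1 in re.escape() is one way to match it literally.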

Example of Regular Expression

Consider a dataset containing a Species column with 150 values, six of which begin with an uppercase letter (Setosa).
The input data is shown in the figure below.


We create the expression shown below. It replaces the word Setosa with Flora in the Species column. The Boolean value is True, which indicates that case sensitivity is ignored.


The result of the Expression node is displayed below. You can see that both types of values (setosa and Setosa) are replaced with the new string Flora.


Next, we remove the flag from the expression, since it is optional.


The result of the Expression node is displayed below. You can see that only the capitalized values (Setosa) are replaced with Flora. The lowercase values (setosa) remain unchanged.
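
The two runs of this example can be approximated outside the tool with a short Python sketch. The helper replace_in_column and its ignore_case parameter are hypothetical names introduced here for illustration, and the species list is a tiny made-up sample, not the 150-row dataset from the figures.

```python
import re

def replace_in_column(column, string1, string2, ignore_case=False):
    """Hypothetical helper: apply a regex replace to every value in a column."""
    # Pass re.IGNORECASE only when requested; 0 means "no flags".
    flags = re.IGNORECASE if ignore_case else 0
    return [re.sub(string1, string2, value, flags=flags) for value in column]

# Made-up sample of a Species column with mixed casing.
species = ["Setosa", "setosa", "versicolor", "setosa", "virginica"]

# First run: flag set, so both 'Setosa' and 'setosa' become 'Flora'.
with_flag = replace_in_column(species, "Setosa", "Flora", ignore_case=True)

# Second run: flag removed, so only the exact casing 'Setosa' is replaced.
without_flag = replace_in_column(species, "Setosa", "Flora")

print(with_flag)     # ['Flora', 'Flora', 'versicolor', 'Flora', 'virginica']
print(without_flag)  # ['Flora', 'setosa', 'versicolor', 'setosa', 'virginica']
```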


