Expression is located under Model Studio ( ) in Data Preparation, in the Task Pane on the left. Use the drag-and-drop method to use the feature in the canvas. Click the feature to view and select different properties for analysis.
Refer to Properties of Expression.
With the Expression function, you can generate new features by modifying a feature or combining two or more existing features from the dataset using arithmetic, logic, text, date, and other such operators.
The available properties of the Expression are as shown in the figure given below.
The table given below describes the different fields present on the Properties pane of the Expression.
Field | Description | Remark | |
Run | It allows you to run the node. | - | |
Explore | It allows you to explore the successfully executed node. | - | |
Vertical Ellipses | The available options are
| - | |
Task Name | It is the name of the task selected on the workbook canvas. | You can click the text field to edit or modify the name of the task as required. | |
Features | It lists the newly created features. | — | |
Add Expression Feature | It allows you to create a new feature. |
| |
Advanced | Node Configuration | It allows you to select the instance of the AWS server to provide control on the execution of a task in a workbook or workflow. | For more details, refer to Worker Node Configuration. |
The Expression function helps you to create new features in your dataset. To create a new feature, follow the steps given below.
The Feature Definition page is displayed as shown in the figure below.
The table given below describes the different fields present on the Feature Definition page.
Field | Description | Remark |
Feature name | It allows you to enter the name of the newly created feature. | — |
Variable Type | It allows you to select the variable type of the feature. | The available options are,
|
Data type | It displays the data type of the feature based on the selected variable type. | — |
Bypass Validation | Bypasses expression validation. | — |
Expression Builder | It allows you to build the expression by adding the Operators, Features, and Constants on the canvas. | For building an expression, refer to Building an Expression. |
It deletes the built expressions and closes the Feature Definition page. | — | |
It validates the expression based on the selected features, operators, and the variable type of the newly created feature. | — | |
It adds a valid newly created expression to the Features list in the Properties pane of the Expression node. | — |
Build your expression. Refer to Building an Expression.
Note: | Alternatively, you can use the Code Editor to build your expression. To build your expression using Code Editor, click Switch to Code Editor in the Expression Builder canvas top-right corner and type your code in Python language. |
The newly created Feature is added in the Features list in the Properties pane, as shown in the figure below.
5. Run the Expression.
After successful execution, the feature is added to the output of the Expression node. You can explore the Expression node to view the newly added feature.
Note: | To add a constant to the canvas, click Constant from Functions in step 2 and select the required constant. |
A sample expression is shown in the figure below.
The Expression Builder on the Feature Definition page has the elements listed below -
The different types of operators are explained in the sections below.
Operator | Code Editor | Syntax/Description | Example |
| Equals to - Element1 == Element2 | [Quantity] == 100 | |
| Not equal to - [Element1] != Element2 | [Quantity] != 0 It returns True when the value of the Quantity feature is not 0; otherwise, it returns False. | |
| Less than - [Element1] < Element2 | [Quantity] < 50 It returns True when the value of the Quantity feature is less than 50; otherwise, it returns False. | |
| Less than equal to - [Element1] <= Element2 It returns True if the value of Element1 is less than or equal to Element2. | [Quantity] <= 50 It returns True when the value of the Quantity feature is less than or equal to 50; otherwise, it returns False. | |
| Greater than - [Element1] > Element2 It returns true if the value of Feature is greater than Element2. | [Quantity] > 35 It returns true when the value of the Quantity feature is greater than 35; otherwise, it returns False. | |
| Greater than equal to - [Element1] >= Element2 It returns true if the value of Element1 is greater than or equal to Element2. | [Quantity] >= 16.35 It returns true when the value of the Quantity feature is greater than or equal to 16.35; otherwise, it returns False. |
Note: | The Element1 and Element2 mentioned in the above table can be Feature, Constant, or another Expression that evaluates to a Numerical value. |
Operator | Code Editor | Syntax/Description | Example/Remark |
| True – It is a Boolean value used to filter the data or to control the flow of execution. | ['Employed'] == True] | |
False | False – It is a Boolean value used to filter the data or to control the flow of execution. | ['Employed'] == False] | |
| Text – It is used to add a string to the expression. | Example : 'Age' or '12" | |
| List – It is used to create a list of items. You can connect other features/expressions to create a list. | [[Age], [Gender, [Education]] In this example, a list is created using three features – Age, Gender, and Education. | |
| Float – It is used to add a float value to the expression. | 123.01 | |
| Number – It is used to add an integer value to the expression. | 123 | |
| Date – It is used to add a date to the expression. | The date format is yyyy-mm-dd. Example: pd.datetime(2023,12,31) | |
{ } | Json - It allows you to store, query, and manipulate the data. | Example: {"name": "Ram", "age": 25} | |
| Key Value Pair - | Example: 'Education' : 'Engineer' |
Operator | Code Editor | Syntax/Description | Example/Remark |
| To BOOLEAN(Element) | Example: bool([Owner]) | |
| To NUMERIC(Element) | Example: float(123) | |
| To INT (Element) | int(12.5) | |
| To STR(Element) It converts the Element to a string value. | str(Rubiscape) | |
| To Date(Element) | Example: datetime.strptime([Owner], "2-3-1998") You can provide a Python date format for conversion. |
Note: | The Element1 mentioned in the above table can be either a Feature, a Constant, or an Expression. |
The different types of Logical operators are explained in the table below.
Operator | Code Editor | Syntax/Description | Example/Remark |
| Equals It returns True if Element1 is equal to Element2. | Example: [Price] == '1000' | |
| Xor It returns True if the values of Element1 and Element2 are not the same. | — | |
| Or It returns True if either Element1 or Element2 evaluates to True. | Example: [Color] or [Neck Style] | |
| And It returns True if Element1 and Element2 both evaluate to True. | Example: [Color] and [Neck Style] | |
| NOT(Feature) | Any value which is not 0 or False is considered as True. Example: not([Color]) |
Note: | The Element1 and Element2 mentioned in the above table can be Feature, Constant, or another Expression. |
The different types of Mathematical operators are explained in the table below.
Operator | Code Editor | Syntax/Description | Example |
| Addition - It is used to add,
|
| |
| Subtraction - [Element1] – [Element2] It is used to subtract,
|
| |
| Multiplication [Element1] * [Element2] It is used to multiply
|
| |
| Division [Element1] / [Element2]
|
| |
| Sin(Element) It returns the sine value of the Element. | Example: math.sin(20) | |
| Cos(Element) It returns the cosine value of the Element. | Example: math.cos(20) | |
| Tan(Element) It returns the tangent of the Element. | Example: math.tan(20) | |
| Floor(Element) It returns the floor value of the Element. | The floor value of a number is the closest integer less than or equal to the given number. Example: math.floor(20) | |
| CEIL(Element) It returns the ceiling value of the Element. | The ceiling value of a number is the closest integer greater than or equal to the given number. Example: math.ceil(100) | |
| ABS(Element) It returns the absolute value of the Element. | Example: abs([Qty]) | |
| Mod(Element1)( Element2) It returns the remainder of the division Element1/Element2 | Example: [Cost]%[Qty] | |
| Power(Element1)( Element2) It returns Element1 to the power of Element2 | Example: math.pow([Price],[Qty]) | |
| It returns the value of the mathematical constant pi. | The value is 3.141593. Example: [Price] * math.pi | |
| NATURAL LOG([Element]) It is used to calculate the natural logarithm (logarithm to the base 10) of an Element. | NATURAL LOG ([Quantity]) If the value of Quantity is 5, this expression evaluates as log10(5). Example: log([Price]) | |
| EXPONENTIAL([Element]) It is used to raise the value of e (2.7183) to the power of Element. | EXPONENTIAL ([Quantity]) If the value of Quantity is 5, this expression evaluates as exp(5) and e5 = (2.7183)5 is calculated. | |
| RECIPROCAL([Element]) It is used to calculate the reciprocal (power of –1) of Element. | RECIPROCAL (Quantity) if the value of Quantity is 5, this expression evaluates to 1/5. | |
| RECIPROCAL SQUARE([Element]) It is used to calculate the reciprocal of the square of Feature. | Example: 1/5**2 if the value of Quantity is 5, this expression evaluates to 1/(52). | |
| ([Element])**2 It is used to calculate the square of the Element. | If the value of Quantity is 5, this expression evaluates to (5)2. | |
| sqrt([Element]) It is used to calculate the square root of the Element. | Example: sqrt(5) If the value of Quantity is 5, this expression evaluates to √5. |
Note: | The Element1 and Element2 mentioned in the above table can be Feature, Constant, or another Expression that evaluates to a Numerical value. |
Operator | Code Editor | Syntax/Description | Example/Remark |
Agg([Column_Names], aggregationMethod = "count") |
| It used to get the aggregated value of the selected feature. Available aggregation methos are
| Example: Agg([Price], aggregationMethod = "max") |
groupByExclude([], measureName = [Column_Names], aggregationMethod = "count") |
| It exclude the selected dimension(s) in the view. Available aggregation methos are
| Example: groupByExclude([Size], measureName = [Qty], aggregationMethod = "min") |
groupByInclude([], measureName = [Column_Names], aggregationMethod = "count") |
| It includes the selected dimension(s) in the view. Available aggregation methos are
| Example: groupByInclude([Size], measureName = [Price], aggregationMethod = "max") |
groupByFixed([], measureName = [Column_Names], aggregationMethod = "count") |
| It fixes the selected dimension in the view. Available aggregation methos are
| Example: groupByFixed([Size], measureName = [Cost], aggregationMethod = "nunique") |
The different types of Date operators are explained in the table below.
Operator | Code Editor | Syntax/Description | Example/Remark |
| Date(Element1, Element2, Element3) It creates a date object with the given values. Element1– Year Element2 – Month Element3 – Date | Example: pd.datetime(2023,12,31) | |
| DateAdd(Element1, Time Component, Element2) It creates a date feature by adding Element2 to the Time Component of Element1. The newly created feature is of date type. |
Example: [Order Date]+ dateutil.relativedelta.relativedelta(weeks=25) | |
| DateDiff(Element1, Time Component, Element2) It creates a date feature by subtracting the Element2 to the Time Component of the Element1. The newly created feature is of date type. |
Example: relativedelta([Order Date], [Ship Date]).years | |
| DateTrunc(Element, Time Component) It creates a date feature by extracting the Time Component from the Element. The rest of the components are given default values. The newly created feature is of date type. |
Example: [Order Date].replace(day=1, hour=0, minute=0, second=0) | |
| Day(Element) It returns the Day component of the Element. The newly created feature is of integer type. | Example: Day(2021-08-29 12:30:45) It returns value 29. | |
| Hour(Element) It returns the Hour component of the Element. The newly created feature is of integer type. | Example: Hour(2021-08-29 12:30:45) It returns value 12. | |
| Minute(Element) It returns the Day component of the Element. The newly created feature is of integer type. | Example: Minute(2021-08-29 12:30:45) It returns value 30. | |
| Now It returns the current date and time in the UTC timezone. The newly created feature is of date type. | — | |
| Month(Element) It returns the Month component of the Element. The newly created feature is of integer type. | Example: Month(2021-08-29 12:30:45) It returns value 08. | |
| Quarter(Element) It returns the Quarter of the year of the date in the Element. The newly created feature is of integer type. | Example: Quarter(2021-08-29 12:30:45) It returns the value 3 since the date is from the third quarter of the year. | |
| Second(Element) It returns the Second component of the Element. The newly created feature is of integer type. | Example: Second(2021-08-29 12:30:45) It returns value 45. | |
| Week(Element) It returns the week of the year of the date in the Element. The newly created feature is of integer type. | Example: Week(2021-08-29 12:30:45) It returns the value 34 since the date is from the 34th week of the year. | |
| Weekday(Element) It returns the numerical form of the day of the week of the Element date. It returns the name of the day (Monday, Tuesday, and so on) from the Element. It takes a date column as input. | Example: weekday([Order Date]) | |
| Year(Element) It returns the year of the Element date. The newly created feature is of integer type. | Example: Year(2021-08-29 12:30:45) It returns value 2021. |
Note: | The Element1, Element2, Element3, and Element mentioned in the above table can be either Feature, Constant, or another Expression that evaluates to an Interval value. |
Operator | Code Editor | Syntax/Description | Example/Remark |
| Begins with It returns True if the string value of the Element begins with a character in Value. |
| |
| Ends with It returns true if the string value of Element ends with character in Value. |
| |
| Compare It returns true if the string in Element is the same as the string in Value. | This operation is case-sensitive. | |
| Concatination It concatenates Element with Value1 and Value2 | Concat(“Monthly”, “ “, “Income) It returns the string as Monthly Income. | |
| Contains It returns True if the Element string contains a string in Value. |
Example: contains([Country Name],'India') | |
| Find It finds the position of a substring in a text. If the substring is found at the location specified by the start index and end index, it returns the position of the text; otherwise, it returns -1. |
Example: 'India'.find('in',0,2) | |
| Length It returns the length of the string in Feature. | Length(“Monthly Income”) It returns the length of the string “Monthly Income” - 14. Example: len([City]) | |
| Replace It replaces a string with a user-specified string. | Replace(City, Pune, PUN, 1) It replaces the rows in the City feature containing value Pune with value PUN. Example: [City].replace('Pune','PUN',1) | |
| Lower It converts the Element to lowercase. | Lower(“Monthly”) It returns the value “monthly”. Example: [Region].lower() | |
| Upper It converts the Element to uppercase. | Upper(“Monthly”) It returns the value “MONTHLY”. Example: [Region].upper() | |
| Left It returns the first number of characters denoted by Number from Element. | Left(“Algorithm”, 4) It returns “Algo”. Example: left([City],4) | |
| Right It returns the last number of characters denoted by Number from Element. | Right(“Algorithm”, 4) It returns “ithm”. Example: right([Element1],Value1) | |
| Like It checks if the Value is present in the Element using a wildcard match. It returns True if the value is present; otherwise, it returns False. | Like(India, “In”) It returns True. Example: contains([City],'Wi') | |
| Not like It checks if the Value is present in the Element using a wildcard match. It returns True if the value is not present; otherwise, it returns False. | Not Like(India, “In”) It returns False. Example: notLike([City],'Wi') | |
| Trim It trims the white spaces at the beginning and end of the Element. | Trim(“ Monthly “) It returns the value as “Monthly”. Example: [City].strip() | |
| Ltrim It trims the white spaces at the beginning of the Element. | LTrim(“ Monthly “) It returns the value as “Monthly ”. Example: [City].lstrip() | |
| Rtrim It trims the white spaces at the end of the Element. | RTrim(“ Monthly “) It returns the value as “ Monthly”. Example: [City].rstrip() | |
| Substring It returns a substring of Element starting from Position containing Number of Letters. | Substring(“Monthly”, 1, 2) It returns the value as “on”. Example: substring([City],start_point = 1,numberOfChars = 2) |
Note: | The Element1 and Element2 mentioned in the above table can be Feature, Constant, or another Expression that evaluates to a Textual value. |
For Dynamic Calculations Types, refer to Dynamic Calculations.
This block contains the Features available in the dataset. You can drag-drop it on the canvas to include the features. Then, use the drop-down arrow to select from the list of available features.
The image below displays an example feature block with a list of features available in a sample dataset.
Consider an HR Dataset with the EmployeeNumber, MonthlyRate, JobInvolvement, BusinessTravel, and a few other features. The input data is shown in the figure below.