Word Embedding is located under Textual Analysis in Pre Processing, in the task pane on the left. Alternatively, use the search bar to find the Word Embedding feature. Drag and drop the algorithm onto the canvas, or double-click it, to use it. Click the algorithm to view and select different properties for analysis.
Properties of Word Embedding
The available properties of Word Embedding are shown below.
The table below describes the different properties of Word Embedding.
Field | Description | Remark |
---|---|---|
Run | It allows you to run the node. | - |
Explore | It allows you to explore the successfully executed node. | - |
Vertical Ellipses | The available options are Run till node, Run from node, Publish as a model, and Publish code. | - |
Task Name | It is the name of the task selected on the workbook canvas. | You can click the text field to edit the task's name. Spaces between words are not allowed in the Task Name. |
Text | It allows you to select one textual column. | You need to select textual data. All data columns are displayed in this dropdown, but the algorithm does not work if you select a non-textual column. |
Advanced: Stop words | A list of words to exclude from the analysis. | The default value is none. You can enter multiple words, separated by commas (','). Spaces between words are allowed. |
Advanced: Dimension | It represents the total number of features encoded in each word embedding, that is, the length of each word vector. | Only integer values are allowed. You can choose any natural number starting from 1. The features themselves are mostly selected automatically during training, using the hidden layer. |
Node Configuration | It allows you to select the instance of the AWS server to provide control over the execution of a task in a workbook or workflow. | For more details, refer to Worker Node Configuration. |
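To make the Stop words and Dimension properties more concrete, here is a minimal Python sketch of how such settings might be interpreted. It is not the product's implementation: the function names `parse_stop_words`, `build_vocabulary`, and `init_embeddings` are hypothetical, the tokenization is deliberately simple, and the vectors are random starting values rather than trained embeddings.

```python
import random


def parse_stop_words(raw):
    """Split a comma-separated string into a set of lower-cased stop words."""
    if not raw:
        return set()
    return {part.strip().lower() for part in raw.split(",") if part.strip()}


def build_vocabulary(texts, stop_words):
    """Collect the distinct words in the texts, skipping stop words."""
    vocabulary = []
    seen = set()
    for text in texts:
        for token in text.lower().split():
            word = token.strip(".,!?;:\"'()")
            if word and word not in stop_words and word not in seen:
                seen.add(word)
                vocabulary.append(word)
    return vocabulary


def init_embeddings(vocabulary, dimension, seed=0):
    """Give every word a random starting vector of length `dimension`."""
    # Assumption: mirrors the "natural number starting from 1" rule in the table.
    if isinstance(dimension, bool) or not isinstance(dimension, int) or dimension < 1:
        raise ValueError("Dimension must be a natural number starting from 1")
    rng = random.Random(seed)
    return {
        word: [rng.uniform(-0.5, 0.5) for _ in range(dimension)]
        for word in vocabulary
    }


# Hypothetical inputs, invented for illustration only.
stop_words = parse_stop_words("the, is, of")
vocabulary = build_vocabulary(
    ["The service is excellent.", "The quality of service is poor."],
    stop_words,
)
embeddings = init_embeddings(vocabulary, dimension=4)
print(vocabulary)
print(len(embeddings["service"]))
```

In this sketch, every remaining word receives a vector whose length equals the chosen dimension. A real embedding model would then adjust those values during training.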
Example of Word Embedding
An employee is given the task of converting a specific set of words into machine-readable vector form while ensuring that words with similar meanings are placed close together in the vector space. He uses Word Embedding to achieve this.
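The idea that words with similar meanings sit close together can be illustrated with cosine similarity, a common closeness measure for word vectors. The following is a sketch only: the three-dimensional vectors are invented for illustration and are not output of the Word Embedding node, and the article does not state which similarity measure the product uses.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Convention chosen here for zero vectors.
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional embeddings, invented for illustration only.
toy_vectors = {
    "good": [0.9, 0.8, 0.1],
    "great": [0.8, 0.9, 0.2],
    "invoice": [0.1, 0.2, 0.9],
}

similar = cosine_similarity(toy_vectors["good"], toy_vectors["great"])
unrelated = cosine_similarity(toy_vectors["good"], toy_vectors["invoice"])
print(round(similar, 3), round(unrelated, 3))
```

With these toy values, "good" and "great" score much closer to 1 than "good" and "invoice", which is the behavior a trained embedding aims for.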
Below is a snippet of the output data produced by the Word Embedding node:
Further, the Result page is as follows.
The result page consists of the following sections:
- Index:
This section displays the index.
- Word:
This section displays the word categories available in the dataset.
- Vector Norm:
- These metrics are computed from the words available in the selected data column, taking the stop words into account. They summarize the vector form of each word category in the dataset.
- The higher the frequency of a word, the larger the norm of its word embedding.
- If a word is used in different contexts, the norm of its vector decreases.
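The three result columns can be sketched in Python as follows. This assumes that Vector Norm is the Euclidean (L2) norm and that the index starts at 0; the article confirms neither. The helper names and the toy embeddings are hypothetical.

```python
import math


def vector_norm(vector):
    """Return the Euclidean (L2) norm of a vector."""
    return math.sqrt(sum(x * x for x in vector))


def result_rows(embeddings):
    """Build (index, word, vector norm) rows, one per word."""
    return [
        (index, word, vector_norm(vector))
        for index, (word, vector) in enumerate(embeddings.items())
    ]


# Hypothetical embeddings, invented for illustration only.
toy_embeddings = {
    "service": [3.0, 4.0],
    "quality": [0.6, 0.8],
}

rows = result_rows(toy_embeddings)
for index, word, norm in rows:
    print(index, word, round(norm, 2))
```

Each row pairs a word with a single number that summarizes the length of its vector, which is what makes norms easy to compare across words.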