Theory Guide | Pre Processing

Pre Processing
Word Embedding
Description: Word embedding is a form of word representation that bridges human understanding of language and that of a machine. Embeddings are useful representations of words and lead to better performance in the various ...
Word Frequency
Description: Word frequency is the number of occurrences of a word in a given text. Why to use: Textual Analysis – Pre Processing. When to use: When you want to find the frequency of a word, that is, the number of times a particular word ...
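As an illustration of the definition above, word frequency can be sketched with the Python standard library. This is a minimal sketch, not the platform's implementation; the function name `word_frequency` and the tokenization rule (lowercase, letters and apostrophes only) are assumptions made for the example.

```python
import re
from collections import Counter


def word_frequency(text):
    """Count how many times each word occurs in `text`.

    Assumption: words are runs of letters/apostrophes, compared case-insensitively.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


counts = word_frequency("The cat saw the other cat")
print(counts["the"], counts["cat"], counts["dog"])
```

A `Counter` returns 0 for words that never occur, so looking up an absent word does not raise an error.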
Stemmer
Description: A stemmer is an automated process that produces a base string in an attempt to represent related words. For example, given the words "runs", "running", and "runner", the algorithm reduces them to the root word "run". ...
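The "runs" / "running" / "runner" example can be reproduced with a toy suffix-stripping rule. This is only a sketch under stated assumptions: the suffix list and the consonant-undoubling rule are hypothetical choices made for illustration, and real stemmers (for example the Porter algorithm) use far more rules.

```python
# Hypothetical suffix list, checked in order; chosen only for this example.
SUFFIXES = ("ing", "er", "s")


def toy_stem(word):
    """Strip one known suffix, then undouble a trailing double consonant."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    # "runn" -> "run": drop a doubled final consonant left by stripping.
    if len(word) >= 2 and word[-1] == word[-2] and word[-1] not in "aeiou":
        word = word[:-1]
    return word


print([toy_stem(w) for w in ["runs", "running", "runner"]])
```

Because the rules are purely string-based, the result is a base string rather than a guaranteed dictionary word, which is the usual trade-off of stemming.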
Spelling Corrector
Description: Spelling Corrector enables you to correct the most cumbersome mistakes with a high degree of accuracy and speed. Why to use: Textual Analysis – Pre Processing. When to use: When there are words present in the textual data ...
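The article does not say which correction algorithm is used. As a rough sketch of the general idea, a misspelled word can be replaced by its closest match in a known vocabulary using `difflib` from the Python standard library. The vocabulary, the function name `correct_word`, and the similarity cutoff are all assumptions for this example.

```python
import difflib

# Hypothetical vocabulary; a real corrector would use a full dictionary.
VOCABULARY = ["receive", "believe", "achieve", "analysis"]


def correct_word(word, vocabulary=VOCABULARY, cutoff=0.8):
    """Return the closest vocabulary word, or the input if nothing is close."""
    if word in vocabulary:
        return word
    matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else word


print(correct_word("recieve"))
print(correct_word("xyz"))
```

Words with no sufficiently similar vocabulary entry are returned unchanged, so the sketch never replaces a word with an unrelated one.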
Punctuation Remover
Description: Punctuation remover is an algorithm that removes punctuation marks such as the full stop, comma, semicolon, question mark, and exclamation mark from the given text. Why to use: Textual ...
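Punctuation removal can be sketched in a few lines of standard-library Python. This is an illustrative sketch rather than the platform's implementation; it assumes that "punctuation" means the ASCII characters in `string.punctuation`, and the function name `remove_punctuation` is hypothetical.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCTUATION_TABLE = str.maketrans("", "", string.punctuation)


def remove_punctuation(text):
    """Return `text` with ASCII punctuation marks removed."""
    return text.translate(_PUNCTUATION_TABLE)


print(remove_punctuation("Hello, world! How are you?"))
```

Non-ASCII punctuation (for example typographic quotes) is left untouched by this sketch and would need to be added to the table explicitly.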
Lemmatizer
Description: Lemmatization reduces a word to its base or dictionary form by using a vocabulary and morphological analysis of words, normally aiming to remove inflectional endings only. ...
Frequent Words Remover
Description: Frequent words remover eliminates frequent words before further processing. These words are called stop words. Why to use: Textual Analysis – Pre Processing. When to use: When you want to remove stop words ...
Custom Words Remover
Description: Custom words remover eliminates user-specified custom words before further processing. Why to use: Textual Analysis – Pre Processing. When to use: When user-defined custom words are to be removed from the ...
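Both the frequent words remover and the custom words remover amount to filtering tokens against a word list; only the source of the list differs. The sketch below assumes a small hypothetical stop-word set (`FREQUENT_WORDS`) and a hypothetical helper `remove_words`; it is not the platform's implementation.

```python
import re

# Hypothetical stop-word list; real lists are much longer.
FREQUENT_WORDS = frozenset({"a", "an", "the", "is", "of", "and", "to", "in"})


def remove_words(text, words_to_remove):
    """Tokenize `text` and drop every token found in `words_to_remove`.

    Matching is case-insensitive; the original casing of kept tokens is preserved.
    """
    blocked = {w.lower() for w in words_to_remove}
    tokens = re.findall(r"[A-Za-z']+", text)
    return [t for t in tokens if t.lower() not in blocked]


# Frequent (stop) words removal.
print(remove_words("The price of the product is high", FREQUENT_WORDS))
# Custom words removal with a user-supplied list.
print(remove_words("Rubiscape makes data analysis simple", {"rubiscape"}))
```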
Case Convertor
Description: Case convertor adjusts the capitalization in a textual document. It can convert letters to lower case or upper case. It is used for preprocessing textual data only. Why to use: Textual Analysis – ...
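Case conversion maps directly onto Python's built-in string methods. The wrapper below is a minimal sketch; the function name `convert_case` and its `mode` parameter are assumptions made for illustration.

```python
def convert_case(text, mode="lower"):
    """Convert `text` to lower or upper case.

    `mode` is a hypothetical parameter: "lower" or "upper".
    """
    if mode == "lower":
        return text.lower()
    if mode == "upper":
        return text.upper()
    raise ValueError(f"unsupported mode: {mode!r}")


print(convert_case("Data Science"))
print(convert_case("Data Science", mode="upper"))
```

Lowercasing before counting or filtering words ensures that "The" and "the" are treated as the same token.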

