Pre Processing

In its general sense, data preprocessing is a data mining technique that transforms raw data into a useful, analyzable form. It involves data cleaning, data transformation, and data reduction.
In textual analysis, pre-processing involves multiple algorithms that convert raw, imprecise data into clean, ready-to-analyze data. Each algorithm has its own specific objective, such as case conversion, lemmatization, word frequency counting, punctuation removal, or advanced entity extraction. These algorithms are used either on their own or in combination with other algorithms.
In rubiscape, the Pre Processing algorithms are:

  • Case Convertor
  • Custom Words Remover
  • Frequent Words Remover
  • Lemmatizer
  • Punctuation Remover
  • Spelling Corrector
  • Stemmer
  • Advanced Entity Extraction
  • Word Correlation
  • Word Frequency
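
To illustrate how several of these steps can be combined, the following is a minimal Python sketch that chains case conversion, punctuation removal, stop-word removal, and word frequency counting. It is not rubiscape's implementation; the function names, the stop-word list, and the sample sentence are all assumptions made for illustration.

```python
import string
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "is", "a", "and", "of", "to"}


def convert_case(text):
    """Case conversion: lower-case the whole text."""
    return text.lower()


def remove_punctuation(text):
    """Punctuation removal: drop every ASCII punctuation character."""
    return text.translate(str.maketrans("", "", string.punctuation))


def remove_words(tokens, unwanted):
    """Frequent or custom words removal: drop tokens found in `unwanted`."""
    return [token for token in tokens if token not in unwanted]


def word_frequency(tokens):
    """Word frequency: count how often each token occurs."""
    return Counter(tokens)


def preprocess(text, stop_words=STOP_WORDS):
    """Chain the steps: case conversion, punctuation removal, word removal."""
    cleaned = remove_punctuation(convert_case(text))
    return remove_words(cleaned.split(), stop_words)


sample = "The data is raw, and the data is noisy!"
tokens = preprocess(sample)
print(tokens)                  # ['data', 'raw', 'data', 'noisy']
print(word_frequency(tokens))  # 'data' occurs twice, the others once
```

Each function can also be called on its own, which mirrors how the algorithms can be used individually or in combination.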

In the task pane, click Textual analysis, and then click Pre Processing.

For more information, refer to Pre-processing Algorithms.
