Stemmer

Description

Stemmer is an automated process that reduces related words to a common base string (the stem).
For example, given the words "runs", "running", and "runner", the algorithm reduces each of them to the root word "run".
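
The idea can be illustrated with a minimal sketch in Python. This is a toy suffix-stripping routine written only for illustration: the rule list `SUFFIXES` and the function `stem` are hypothetical, and this is not the algorithm used by the Stemmer component itself.

```python
# Toy suffix-stripping stemmer: illustrative only, not the component's actual algorithm.
SUFFIXES = ("ing", "er", "s")  # hypothetical rule list, checked in this order


def stem(word: str) -> str:
    """Return a crude stem by lowercasing and stripping one known suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words such as "sing" stay intact.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    # Undo the consonant doubling left behind by the suffix ("runn" -> "run").
    if len(word) > 3 and word[-1] == word[-2] and word[-1] not in "aeiou":
        word = word[:-1]
    return word


words = ["runs", "running", "runner"]
stems = [stem(w) for w in words]
print(stems)  # ['run', 'run', 'run']
```

Production stemmers apply a much larger, ordered set of rules, so their output on other words can differ from this sketch.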

Why to use

Textual Analysis – Pre Processing 

When to use

When you want to reduce the words in textual data to their root form. It maps a group of related words to the same stem, even if the stem itself is not a valid word in the language.
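
A short Python sketch of this grouping behaviour follows. The `stem` function and its suffix rules are assumptions made for illustration, not the component's implementation. Note that the stem "studi" is not a valid English word, yet it still serves as a shared key for the related forms.

```python
from collections import defaultdict


def stem(word: str) -> str:
    """Crude illustrative stemmer: strip one suffix, then map a final 'y' to 'i'."""
    word = word.lower()
    for suffix in ("ing", "es", "ed", "s"):  # hypothetical rules
        # Only strip when at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    if word.endswith("y"):
        word = word[:-1] + "i"
    return word


words = ["study", "studies", "studied", "studying",
         "connect", "connects", "connected", "connecting"]

# Map every word to its stem so that related forms share one key.
groups = defaultdict(list)
for w in words:
    groups[stem(w)].append(w)

for key, members in groups.items():
    print(key, "<-", members)
```

Grouping by stem in this way lets later steps, such as word counting, treat all forms of a word as one term.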

When not to use

On numerical data.

Prerequisites

The input must be textual data.

Input

Program
programs
programmer
programming
programers

Output

program
program
program
program
program

Predecessor

  • Case Convertor
  • Custom Words Remover
  • Frequent Words Remover
  • Lemmatizer
  • Punctuation Remover
  • Spelling Corrector
  • Advanced Entity Extraction
  • Word Correlation
  • Word Frequency

Successor

  • Case Convertor
  • Custom Words Remover
  • Frequent Words Remover
  • Lemmatizer
  • Punctuation Remover
  • Spelling Corrector
  • Advanced Entity Extraction
  • Word Correlation
  • Word Frequency

Related algorithms

  • Case Convertor
  • Custom Words Remover
  • Frequent Words Remover
  • Lemmatizer
  • Punctuation Remover
  • Spelling Corrector
  • Advanced Entity Extraction
  • Word Correlation
  • Word Frequency 

Alternative algorithm

Lemmatizer

Statistical Methods used

-

Limitations

It can often create non-existent words that do not have any meaning.


Stemming is a process used in information retrieval that reduces an inflected or derived word to its stem or root form. It produces a base string to represent related words.
For example, the root word 'run' can represent words such as 'runs' and 'running'.