Frequent Words Remover

Description

Frequent words remover eliminates frequently occurring words from the text before further processing. These words are called stop words.

Why to use

Textual Analysis – Pre-processing

When to use

When you want to remove stop words from the data. 

When not to use

On numerical data.

Prerequisites

The input must be textual data.

Input

Nick likes to play football, however he is not too fond of tennis.

Output

'Nick', 'likes', 'play', 'football', ',', 'however', 'fond', 'tennis', '.'
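In this example, the stop words "to", "he", "is", "not", "too", and "of" have been removed, while punctuation marks are kept as separate tokens. For the list of stop words in NLTK, refer to the table under Stop words.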
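
The example above can be reproduced with a minimal sketch in plain Python. This is an illustration only, not the product's implementation: the function name remove_stop_words, the regular-expression tokenizer, and the short STOP_WORDS set (a subset of the stop words listed in this article) are all assumptions made for the example.

```python
import re

# Assumed subset of the stop-word list shown in this article;
# a real pipeline would load the complete list.
STOP_WORDS = {"to", "he", "is", "not", "too", "of", "the", "a", "an", "and"}


def remove_stop_words(text, stop_words=STOP_WORDS):
    """Split text into word and punctuation tokens, then drop stop words."""
    # \w+ matches a run of word characters; [^\w\s] matches one punctuation mark.
    tokens = re.findall(r"\w+|[^\w\s]", text)
    # Compare in lower case so "The" and "the" are treated alike.
    return [tok for tok in tokens if tok.lower() not in stop_words]


sentence = "Nick likes to play football, however he is not too fond of tennis."
print(remove_stop_words(sentence))
# ['Nick', 'likes', 'play', 'football', ',', 'however', 'fond', 'tennis', '.']
```

Matching is done on the lower-cased token so that capitalized stop words at the start of a sentence are also removed, while the surviving tokens keep their original case.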

Related algorithms

  • Case Convertor
  • Custom Words Remover
  • Lemmatizer
  • Punctuation Remover
  • Spelling Corrector
  • Stemmer
  • Advanced Entity Extraction
  • Word Correlation
  • Word Frequency

Alternative algorithm

-

Statistical Methods used

-

Limitations

It cannot be used on numerical data.


One of the major tasks of data pre-processing in text mining is to filter out useless data. In NLP, such useless words are called stop words. Frequent words remover eliminates these frequently occurring words before further processing. The output is text devoid of stop words, which helps you extract the data you need.

Stop words

'ourselves', 'hers', 'between', 'yourself', 'but', 'again', 'there', 'about', 'once', 'during', 'out', 'very', 'having', 'with', 'they', 'own', 'an', 'be', 'some', 'for', 'do', 'its', 'yours', 'such', 'into', 'of', 'most', 'itself', 'other', 'off', 'is', 's', 'am', 'or', 'who', 'as', 'from', 'him', 'each', 'the', 'themselves', 'until', 'below', 'are', 'we', 'these', 'your', 'his', 'through', 'don', 'nor', 'me', 'were', 'her', 'more', 'himself', 'this', 'down', 'should', 'our', 'their', 'while', 'above', 'both', 'up', 'to', 'ours', 'had', 'she', 'all', 'no', 'when', 'at', 'any', 'before', 'them', 'same', 'and', 'been', 'have', 'in', 'will', 'on', 'does', 'yourselves', 'then', 'that', 'because', 'what', 'over', 'why', 'so', 'can', 'did', 'not', 'now', 'under', 'he', 'you', 'herself', 'has', 'just', 'where', 'too', 'only', 'myself', 'which', 'those', 'i', 'after', 'few', 'whom', 't', 'being', 'if', 'theirs', 'my', 'against', 'a', 'by', 'doing', 'it', 'how', 'further', 'was', 'here', 'than'
