Anomaly Detection

Anomaly detection is the identification of events or observations that differ substantially from most of the data. Anomalies are also known as outliers, deviations, novelties, exceptions, or noise. Anomaly detection techniques fall into three categories:

  1. Unsupervised
  2. Supervised
  3. Semi-supervised

Unsupervised anomaly detection techniques assume that the majority of the data points are normal. They detect anomalies by looking for the data points that fit least well with the rest of the dataset.
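As a minimal sketch of the unsupervised idea, the snippet below flags the values that fit least well with the rest of a small, unlabeled sample, using a modified z-score based on the median and the median absolute deviation (MAD). The function name `mad_outliers`, the sample readings, and the cutoff of 3.5 are illustrative assumptions and do not come from any specific product or library.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds threshold."""
    med = median(values)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # At least half the values equal the median; the score is undefined.
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical, unlabeled sensor readings (assumed data for illustration).
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 25.0, 10.1]
outlier_indices = mad_outliers(readings)
print(outlier_indices)
```

With this sample, only the reading 25.0 (index 6) is flagged. The median-based score is used here instead of the mean and standard deviation because a single extreme value can inflate the standard deviation enough to hide itself.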

Supervised anomaly detection techniques require a dataset whose data points are labeled as "normal" or "abnormal". They involve training a classifier to distinguish between the two classes.
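The supervised setting can be illustrated with a toy classifier trained on labeled points. The sketch below uses a nearest-centroid rule, chosen only because it fits in a few lines; the helper names `fit_centroids` and `predict`, the two-feature training points, and their labels are all hypothetical. Real anomalies are usually rare and varied, so a practical system would likely need a stronger classifier and some handling of class imbalance.

```python
from math import dist


def fit_centroids(points, labels):
    """Compute the mean point (centroid) of each labeled class."""
    sums, counts = {}, {}
    for point, label in zip(points, labels):
        acc = sums.setdefault(label, [0.0] * len(point))
        for i, x in enumerate(point):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(s / counts[label] for s in acc)
            for label, acc in sums.items()}


def predict(centroids, point):
    """Assign the label of the closest class centroid (Euclidean distance)."""
    return min(centroids, key=lambda label: dist(centroids[label], point))


# Hypothetical labeled training data: two numeric features per data point.
train_points = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (8.0, 9.0), (7.5, 8.5)]
train_labels = ["normal", "normal", "normal", "abnormal", "abnormal"]

centroids = fit_centroids(train_points, train_labels)
print(predict(centroids, (1.0, 1.0)))
print(predict(centroids, (7.0, 8.0)))
```

The first query point lies close to the "normal" centroid and the second close to the "abnormal" one, so they receive those labels respectively.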

Semi-supervised anomaly detection techniques build a model that represents normal behavior from a training dataset of normal data points. They then test the likelihood that a new data point was generated by the model.
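One simple way to picture the semi-supervised approach is to fit a probability model to normal-only data and flag new points that the model considers very unlikely. The sketch below assumes a single numeric feature that is roughly Gaussian; the training values, the helper `is_anomalous`, and the density cutoff `min_density=0.01` are assumptions made for illustration and would need tuning on real data.

```python
from statistics import NormalDist

# Hypothetical training data containing only normal behavior.
normal_training = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2, 20.1, 19.7]

# Model of normal behavior: a Gaussian fitted to the normal-only sample.
model = NormalDist.from_samples(normal_training)


def is_anomalous(x, model, min_density=0.01):
    """Flag x when the fitted model assigns it a very low probability density."""
    return model.pdf(x) < min_density


print(is_anomalous(20.0, model))
print(is_anomalous(23.0, model))
```

A value near the training mean, such as 20.0, has a high density under the model and is not flagged, while 23.0 lies many standard deviations away and is flagged.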

Anomaly detection is used for:

  1. detecting intrusions
  2. detecting fraud
  3. detecting faults
  4. monitoring system health
  5. detecting ecosystem disturbances
  6. detecting defects in images using machine vision

Anomaly detection algorithms are used in data preprocessing to remove inconsistent data from a dataset. In supervised learning, this step, also known as data cleansing, is an important part of preparing a dataset for training.
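A minimal sketch of such a cleansing step is shown below. It drops rows whose feature value falls outside the interquartile-range (IQR) fences before the data would be passed to a model. The helper `remove_iqr_outliers`, the `(age, label)` rows, and the multiplier `k=1.5` are hypothetical choices for illustration; other detection methods could be substituted.

```python
from statistics import quantiles


def remove_iqr_outliers(rows, key, k=1.5):
    """Keep only rows whose key value lies within the IQR fences."""
    values = [key(row) for row in rows]
    q1, _, q3 = quantiles(values, n=4)  # Quartile cut points.
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [row for row in rows if low <= key(row) <= high]


# Hypothetical training rows of (age, label); the age 240 is an entry error.
raw_rows = [(23, "no"), (25, "no"), (31, "yes"), (35, "yes"), (29, "no"),
            (27, "no"), (240, "yes"), (33, "yes"), (30, "no"), (28, "no")]

clean_rows = remove_iqr_outliers(raw_rows, key=lambda row: row[0])
print(len(raw_rows), len(clean_rows))
```

Only the row with age 240 falls outside the fences, so nine of the ten rows remain for training.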