Isolation Forest

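Description	Isolation Forest is an unsupervised anomaly detection algorithm. Instead of building a model of normal instances, it isolates anomalies directly: because anomalies are few and different, they can be separated from the rest of the data with fewer random splits.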
Why to use	Isolation Forest generally detects anomalies faster and uses less memory than many other anomaly detection algorithms, because each tree is built on a small random subsample of the data.
When to use	When the input data is large and high-dimensional.
When not to use	When features are poorly extracted, when normal and abnormal behaviour in the data are not clearly defined, or when the abnormal data varies widely. These conditions increase the dataset's complexity, and Isolation Forest is not suitable in such cases.
Prerequisites	The data should contain only numeric (continuous) variables and must not contain any missing values.
Input	Any classification dataset with numeric input variables.
Output	Anomaly scores, anomaly labels, score samples, and a cluster plot showing inliers and outliers (anomalies). Each data point in the input dataset is labeled 1 or -1: 1 denotes a normal data point and -1 denotes an outlier.
Statistical Methods used	It works on the principle of the decision tree algorithm, which is a machine learning algorithm rather than a statistical method.
Limitations	It may fail to detect local anomalies (points that are anomalous only relative to their immediate neighbourhood), which reduces the accuracy of the algorithm.
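
The prerequisites and outputs listed above can be illustrated with a minimal sketch. It assumes scikit-learn's IsolationForest as a stand-in; this article does not state which implementation the platform uses. The helper name detect_anomalies, the synthetic data, and the contamination value of 0.05 are hypothetical choices made only for this example.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def detect_anomalies(X, contamination=0.05, random_state=0):
    """Hypothetical helper: returns (labels, scores) for a numeric 2-D array."""
    # Prerequisite: numeric data only. Conversion raises if values are non-numeric.
    X = np.asarray(X, dtype=float)
    # Prerequisite: no missing values.
    if np.isnan(X).any():
        raise ValueError("Input contains missing values")

    model = IsolationForest(
        n_estimators=100,
        contamination=contamination,  # assumed share of outliers in the data
        random_state=random_state,
    )
    labels = model.fit_predict(X)    # 1 = normal data point, -1 = outlier
    scores = model.score_samples(X)  # lower score = more anomalous
    return labels, scores


# Synthetic data: 200 normal points around the origin plus 3 obvious outliers.
rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(200, 2))
outliers = np.array([[8.0, 8.0], [-9.0, 7.5], [10.0, -10.0]])
X = np.vstack([normal, outliers])

labels, scores = detect_anomalies(X)
print("Outliers flagged:", int((labels == -1).sum()))
print("Labels of the 3 injected points:", labels[-3:])
```

In this sketch the contamination parameter decides how many points receive the -1 label, so it should reflect the share of anomalies expected in the data. The raw scores can be inspected directly when a fixed share is not appropriate.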
