Yue Zhao / pyod / Issues / #79 (Closed)
Issue created Apr 19, 2019 by Administrator (@root)

IForest: FutureWarning: behaviour="old" is deprecated

Created by: bflammers

Hi,

Thanks for a great library!

When instantiating a new IForest object, scikit-learn emits the following warning:

FutureWarning: behaviour="old" is deprecated and will be removed in version 0.22. Please use behaviour="new", which makes the decision_function change to match other anomaly detection algorithm API. FutureWarning)

The new behaviour in sklearn's IsolationForest changes where the threshold between anomalies and normal observations is set. See the documentation for the behaviour argument and the offset_ attribute:

behaviour : str, default='old'
    Behaviour of the decision_function which can be either 'old' or 'new'. Passing behaviour='new' makes the decision_function change to match other anomaly detection algorithm API which will be the default behaviour in the future. As explained in details in the offset_ attribute documentation, the decision_function becomes dependent on the contamination parameter, in such a way that 0 becomes its natural threshold to detect outliers.

offset_ : float
    Offset used to define the decision function from the raw scores. We have the relation: decision_function = score_samples - offset_. Assuming behaviour == 'new', offset_ is defined as follows. When the contamination parameter is set to "auto", the offset is equal to -0.5 as the scores of inliers are close to 0 and the scores of outliers are close to -1. When a contamination parameter different than "auto" is provided, the offset is defined in such a way we obtain the expected number of outliers (samples with decision function < 0) in training. Assuming the behaviour parameter is set to 'old', we always have offset_ = -0.5, making the decision function independent from the contamination parameter.
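
To make the quoted relation concrete, here is a rough pure-Python sketch of how the offset could move the threshold. The helper names (offset_for, count_outliers) and the score values are made up for illustration, and the quantile rule is a simplification of whatever sklearn does internally, not its actual implementation.

```python
import math


def offset_for(scores, contamination, behaviour="new"):
    """Illustrative offset following the quoted docs (hypothetical helper)."""
    # 'old' behaviour, or contamination="auto", always uses the fixed offset.
    if behaviour == "old" or contamination == "auto":
        return -0.5
    # Assumption: pick the offset as a simple quantile so that about
    # `contamination` of the training scores end up strictly below it.
    ordered = sorted(scores)
    k = min(int(math.floor(contamination * len(ordered))), len(ordered) - 1)
    return ordered[k]


def count_outliers(scores, offset):
    """Count samples whose decision value (score - offset) is negative."""
    return sum(1 for s in scores if s - offset < 0)


# Made-up raw scores: inliers sit near -0.4, two samples score much lower.
scores = [-0.70, -0.65, -0.48, -0.45, -0.44, -0.43, -0.42, -0.41, -0.40, -0.39]

new_offset = offset_for(scores, contamination=0.1, behaviour="new")
old_offset = offset_for(scores, contamination=0.1, behaviour="old")

print("new:", new_offset, count_outliers(scores, new_offset))
print("old:", old_offset, count_outliers(scores, old_offset))
```

With these toy scores the 'new' offset follows the requested contamination, while the 'old' offset stays at -0.5 regardless of it, which is the difference the documentation describes.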

I think a simple fix would be to pass the argument behaviour="new" in the call to sklearn.ensemble.IsolationForest.
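
Something along these lines might work as a sketch of the fix. The helper supported_kwargs is hypothetical (not part of pyod or sklearn); it only forwards behaviour="new" when the installed IsolationForest constructor still lists that parameter, on the assumption that some sklearn releases may not accept it.

```python
import inspect


def supported_kwargs(estimator_cls, **params):
    """Keep only the keyword arguments that estimator_cls.__init__ declares."""
    accepted = inspect.signature(estimator_cls.__init__).parameters
    return {name: value for name, value in params.items() if name in accepted}


try:
    from sklearn.ensemble import IsolationForest
except ImportError:
    # scikit-learn is not installed; the helper above can still be reused.
    IsolationForest = None

if IsolationForest is not None:
    # behaviour="new" is dropped automatically if the constructor lacks it.
    kwargs = supported_kwargs(
        IsolationForest,
        contamination=0.1,
        behaviour="new",
        random_state=0,
    )
    detector = IsolationForest(**kwargs)
    print(sorted(kwargs))
```

Checking the signature instead of hard-coding the argument is just one option; pinning a version check would also do, but this avoids guessing which release changed the parameter.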
