pyod · Issue #434 · Closed
Issue created Sep 02, 2022 by Lucew (@Lucew)

PyTorch AutoEncoder starts with BatchNorm before the first layer

Dear Contributors,

If BatchNorm is enabled in the options, the PyTorch AutoEncoder starts with a BatchNorm layer before passing the input samples to the first linear layer. This is probably unintended behavior: normally the input data is already preprocessed and should be passed straight into the first linear layer.

IMHO, BatchNorm should only be applied after the data has been processed by a linear layer. Preprocessing the input via normalization and then immediately applying BatchNorm to it is effectively doing the same thing twice.

This behavior also differs from the TensorFlow implementation, where the data is passed directly into the first linear layer: https://github.com/yzhao062/pyod/blob/master/pyod/models/auto_encoder.py#L172-L179

EDIT: I just noticed there is also a dropout layer after the last linear layer (and activation). IMHO this should not be the case either, as it deliberately sets elements of the reconstructed sample (the output) to zero. This also differs from the TensorFlow implementation, see https://github.com/yzhao062/pyod/blob/master/pyod/models/auto_encoder.py#L190-L191. Additionally, dropout is applied right before the batch norm, which will definitely skew the result: the batch norm cannot learn a good normalization if dropout randomly zeroes its input during training (and is then turned off for inference).

TL;DR

Currently the order is: Input->BatchNorm->Neurons->Activation->DropOut->BatchNorm->.... IMHO it should be: Input->Neurons->BatchNorm->Activation->DropOut->Neurons->....
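
To make the proposed ordering concrete, here is a minimal sketch of how one hidden block would look in plain PyTorch with the linear layer first (illustration only, not the actual patch):

import torch.nn as nn

# Proposed ordering for a single hidden block: the linear layer comes first,
# then BatchNorm on its activations, then the non-linearity, then dropout.
block = nn.Sequential(
    nn.Linear(300, 64),   # Neurons
    nn.BatchNorm1d(64),   # BatchNorm on the layer output, not on the raw input
    nn.ReLU(),            # Activation
    nn.Dropout(p=0.2),    # DropOut (and omitted entirely after the final reconstruction layer)
)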


I verified my suspicion by running the PyTorch AutoEncoder example with print(cls) added after Line 39.
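
For reference, the check boils down to roughly the following standalone snippet (a sketch on my side; the import path pyod.models.auto_encoder_torch and the .model attribute are what I believe the torch variant uses, please correct me if they differ):

import numpy as np
from pyod.models.auto_encoder_torch import AutoEncoder  # torch-based AutoEncoder

# Random data with 300 features, matching the 300 input features in the output below.
X_train = np.random.randn(1000, 300).astype(np.float32)

clf = AutoEncoder(hidden_neurons=[64, 32], batch_norm=True,
                  dropout_rate=0.2, epochs=10)
clf.fit(X_train)

print(clf)        # estimator parameters
print(clf.model)  # inner network, showing the layer order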

The output was the following:

AutoEncoder(batch_norm=True, batch_size=32, contamination=0.1,
      device=device(type='cpu'), dropout_rate=0.2, epochs=10,
      hidden_activation='relu', hidden_neurons=[64, 32],
      learning_rate=0.001, loss_fn=MSELoss(), preprocessing=True,
      weight_decay=1e-05)
inner_autoencoder(
  (activation): ReLU()
  (encoder): Sequential(
    (batch_norm0): BatchNorm1d(300, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (linear0): Linear(in_features=300, out_features=64, bias=True)
    (relu0): ReLU()
    (dropout0): Dropout(p=0.2, inplace=False)
    (batch_norm1): BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (linear1): Linear(in_features=64, out_features=32, bias=True)
    (relu1): ReLU()
    (dropout1): Dropout(p=0.2, inplace=False)
  )
  (decoder): Sequential(
    (batch_norm0): BatchNorm1d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (linear0): Linear(in_features=32, out_features=64, bias=True)
    (dropout0): Dropout(p=0.2, inplace=False)
    (batch_norm1): BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (linear1): Linear(in_features=64, out_features=300, bias=True)
    (dropout1): Dropout(p=0.2, inplace=False)
  )
)

There you can see that the first element in the encoder's Sequential is a BatchNorm. A pull request resolving this issue is on the way. Thanks for reading! (EDIT: see pull request #435.)

If my fine-grained pull requests and issues are getting annoying, please let me know. I don't want to cause any extra work.
