metaseq · Merge request !511

[wip] Adding flash attention for sequence parallel

Open · Administrator requested to merge flash_attention_seqpar into main · Nov 11, 2022

Created by: stephenroller

Patch Description

Since we're doing manual activation checkpointing, we need a custom backward for MHA. This patch leverages the flash attention implementation in xformers.
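A minimal sketch of the idea (not the MR's actual diff): swap the vanilla softmax(QK^T)V path inside the attention core for xformers' fused kernel. The `mha_core` helper, the tensor layout, and the `causal` flag are illustrative assumptions; `xformers.ops.memory_efficient_attention` is the real entry point, which dispatches to a FlashAttention-style kernel and defines its own backward, so the attention matrix never has to be stored for the manual checkpoint.

```python
# Illustrative only -- helper name and shapes are assumptions, not the MR diff.
import torch
import xformers.ops as xops

def mha_core(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor,
             dropout_p: float = 0.0, causal: bool = True) -> torch.Tensor:
    """Fused attention over [batch, seq_len, num_heads, head_dim] tensors.

    memory_efficient_attention never materializes the full [seq_len, seq_len]
    score matrix and supplies its own backward, which is what makes it
    compatible with manually activation-checkpointing the surrounding block.
    """
    attn_bias = xops.LowerTriangularMask() if causal else None
    return xops.memory_efficient_attention(q, k, v, attn_bias=attn_bias, p=dropout_p)
```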

TODO:

  • Gate behind the appropriate (existing) flag, and keep the vanilla implementation available
  • Add a test

Testing steps

At very large scale, this was a ~0.5-1% speedup. It is probably not worth it at the largest scales given the risk of numeric changes, but may still be worthwhile at medium scales.

Source branch: flash_attention_seqpar