Merge request !290

moving future_mask to cuda for document attention

Merged Administrator requested to merge kchakrabarty/docattentionspeedup into main Aug 04, 2022

Created by: KUNAL1612

Patch Description: Moved the future mask to CUDA so that all operations for document attention also take place on CUDA; see issue #285 for context. The speedup per call is marginal, but it may accumulate over many runs.
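The idea behind the change can be sketched as follows: build (and cache) the causal future mask directly on the input tensor's device instead of on CPU, so the masking ops never force a device transfer. This is a minimal sketch, not the actual metaseq diff; the function name and the `mask_cache` dict (standing in for a module attribute) are hypothetical.

```python
import torch


def buffered_future_mask(tensor: torch.Tensor, mask_cache: dict) -> torch.Tensor:
    """Return a causal (upper-triangular -inf) mask on `tensor`'s device.

    Hypothetical helper: `mask_cache` stands in for a module-level buffer.
    Creating the mask with `device=tensor.device` keeps document-attention
    masking entirely on GPU when the input lives there.
    """
    dim = tensor.size(1)
    mask = mask_cache.get("future_mask")
    if mask is None or mask.size(0) < dim or mask.device != tensor.device:
        # Entries strictly above the diagonal are -inf; everything else is 0.
        mask = torch.triu(
            torch.full((dim, dim), float("-inf"), device=tensor.device),
            diagonal=1,
        )
        mask_cache["future_mask"] = mask
    # Slicing a cached oversized mask avoids rebuilding it every call.
    return mask[:dim, :dim]
```

On a CUDA machine the mask is allocated on the GPU the first time and reused afterwards; on CPU the code behaves identically, just without the transfer savings.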

To test, I set the attention document separator to a random value to force execution of this branch, and timed it using the method from #220 (closed). Over 436 calls, the function took 0.42 seconds on CPU versus 0.09 seconds on GPU.
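Timing CUDA code fairly requires synchronizing before reading the clock, since kernels launch asynchronously. A minimal sketch of such a timing harness (a hypothetical helper, not the exact methodology from #220):

```python
import time

import torch


def time_fn(fn, *args, n_calls: int = 436) -> float:
    """Time `n_calls` invocations of `fn`, synchronizing CUDA so that
    asynchronously queued kernels are included in the measurement.

    Hypothetical helper for illustration; `n_calls` defaults to the 436
    calls mentioned in the MR description.
    """
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(n_calls):
        fn(*args)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    return time.perf_counter() - start
```

Without the `synchronize()` calls, a GPU benchmark can appear misleadingly fast because the timer stops before the queued kernels finish.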

Source branch: kchakrabarty/docattentionspeedup