metaseq merge request !245

[checkpoint] Fix race condition with blob.

Merged. Administrator requested to merge checkpointrace into main on Jul 22, 2022.

Created by: stephenroller

Patch Description: When downloading checkpoints from blob, we have a race condition: some workers can proceed before all checkpoints have finished downloading to the local workers, which causes an exception. This patch ensures every worker waits until all other workers have finished downloading.
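
The idea of making every worker wait on every other worker can be illustrated with a barrier. The sketch below is not the code from this merge request, whose diff is not shown here. It simulates workers with Python threads and `threading.Barrier`. The function name `run_workers` and the sleep-based stand-in for the download are hypothetical. In a real distributed job the same role would presumably be played by a collective such as `torch.distributed.barrier()`, but that is an assumption about the fix.

```python
import threading
import time


def run_workers(num_workers, download_delays):
    """Simulate workers that each download a checkpoint, then load it.

    Hypothetical illustration only: threads stand in for distributed
    workers, and time.sleep stands in for the blob download.
    Returns, per rank, how many downloads were complete when that
    rank moved on to the load step.
    """
    barrier = threading.Barrier(num_workers)
    lock = threading.Lock()
    downloaded = set()
    seen_at_load = {}

    def worker(rank):
        time.sleep(download_delays[rank])  # stand-in for the download
        with lock:
            downloaded.add(rank)
        # No worker passes this point until all workers have reached it,
        # so every download is complete before anyone starts loading.
        barrier.wait()
        with lock:
            seen_at_load[rank] = len(downloaded)

    threads = [
        threading.Thread(target=worker, args=(rank,))
        for rank in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen_at_load
```

Without the `barrier.wait()` call, a fast worker could reach the load step while slower workers are still downloading, which is the kind of race the description refers to.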

Testing steps: cm3 branch

Source branch: checkpointrace