Issue created May 02, 2019 by Administrator (@root)

Possible infinite loop if BRPOPLPUSH fails?

Created by: smennesson

Description

Hello, today we hit what looked like an infinite loop in the Bull library. Our service is hosted on Heroku with the Redis add-on, and today we reached the memory quota of the Redis database. As a result, our logs filled up with an enormous number of entries like this:

BRPOPLPUSH { ReplyError: OOM command not allowed when used memory > 'maxmemory'. 
    at parseError (/app/node_modules/ioredis/node_modules/redis-parser/lib/parser.js:179:12) 
    at parseType (/app/node_modules/ioredis/node_modules/redis-parser/lib/parser.js:302:14) 
  command: 
   { name: 'brpoplpush', 
     args: 
      [ 'bull:<name-of-our-job>:wait', 
        'bull:<name-of-our-job>:active', 
        '5' ] } } 

The logs grew to about 1 GB within a few minutes, until we fixed the quota issue.

Looking briefly at the code in lib/queue.js, it seems that an error from BRPOPLPUSH is ignored in Queue.prototype.getNextJob. So my guess is that the loop that looks for new jobs to process kept retrying immediately and hit the same error every time, with nothing to slow it down.
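To make the suspicion concrete, here is a simplified model of the failure mode. It is not Bull's actual code: `failingPop` and `pollIgnoringErrors` are hypothetical names, and the loop is capped so the example terminates. The point is that a blocking pop normally throttles the loop by waiting up to its timeout, but when it fails immediately and the error is swallowed, every iteration returns at once.

```javascript
// Stand-in for a blocking pop that fails immediately instead of waiting,
// as BRPOPLPUSH did for us once Redis was over its memory quota.
// (Hypothetical function, not part of Bull or ioredis.)
function failingPop() {
  throw new Error("OOM command not allowed when used memory > 'maxmemory'.");
}

// Hypothetical polling loop that ignores errors and retries right away.
// maxIterations only exists so this illustration stops on its own.
function pollIgnoringErrors(pop, maxIterations) {
  let failures = 0;
  for (let i = 0; i < maxIterations; i++) {
    try {
      pop();
    } catch (err) {
      // The error is ignored (or only logged) and the loop continues at once.
      failures += 1;
    }
  }
  return failures;
}

// Every single iteration fails, and nothing slows the loop down.
const failures = pollIgnoringErrors(failingPop, 100000);
console.log(`failed attempts: ${failures}`);
```

If each failed attempt also writes a log entry, this would explain how the logs could grow so quickly.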

I don't know enough about how Bull works internally to propose a fix, but I think this should be handled, for example by detecting repeated errors on BRPOPLPUSH and adding a delay before retrying when they happen too frequently.
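A minimal sketch of the kind of handling I have in mind, assuming a generic polling loop rather than Bull's real internals. All names here (`backoffDelay`, `pollWithBackoff`, `fetchNext`, `handle`, `isRunning`, `pause`) and the delay values are hypothetical: the loop counts consecutive errors and waits longer after each one, up to a cap, and resets the counter as soon as a fetch succeeds.

```javascript
// Hypothetical helper, not part of Bull: exponential backoff delay in ms.
// 1 error -> baseMs, 2 errors -> 2 * baseMs, ... capped at maxMs.
function backoffDelay(consecutiveErrors, baseMs = 100, maxMs = 5000) {
  if (consecutiveErrors <= 0) return 0;
  return Math.min(maxMs, baseMs * 2 ** (consecutiveErrors - 1));
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// fetchNext stands in for the blocking pop (e.g. BRPOPLPUSH). It, handle,
// isRunning and pause are assumptions made for this sketch.
async function pollWithBackoff(fetchNext, handle, isRunning, pause = sleep) {
  let consecutiveErrors = 0;
  while (isRunning()) {
    try {
      const job = await fetchNext();
      consecutiveErrors = 0; // a successful fetch resets the backoff
      if (job != null) await handle(job);
    } catch (err) {
      consecutiveErrors += 1;
      const delay = backoffDelay(consecutiveErrors);
      console.error(
        `fetch failed (${consecutiveErrors} in a row), retrying in ${delay} ms:`,
        err.message
      );
      await pause(delay); // wait before retrying instead of spinning
    }
  }
}
```

With these example values, a persistent failure would settle at one retry (and one log line) every 5 seconds instead of retrying as fast as the loop can run.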

Bull version

v3.7.0

(I just saw that 3.8 is out; judging from the changelog, it does not seem to fix this.)
