OptimalBits / bull / Issues / #1705
Issue created Apr 22, 2020 by Administrator (@root)

Jobs with delay < -new Date().getTime() never complete and cause 100% CPU usage

Created by: harris-m

Description

If you add a job with a delay less than the negative of the current Unix time in milliseconds (delay < -new Date().getTime()), the job never runs and bull thrashes, constantly running updateDelaySet.

Minimal, Working Test code to reproduce the issue.

(An easy to reproduce test case will dramatically decrease the resolution time.)

var Queue = require("bull");

var testQueue = new Queue("test_queue", "redis://127.0.0.1:6379");

testQueue.process(function (job, done) {
  console.log("job complete");
  done();
});

// this works fine
// testQueue.add({}, { delay: -new Date().getTime() });

// subtract 1
// this never completes and locks CPU at 100%
testQueue.add({}, { delay: -new Date().getTime() - 1 });
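
A rough sketch of why a single millisecond makes the difference, assuming (as the explanation further down suggests) that the delayed timestamp is derived as the job's creation time plus its delay. The helper name `delayedTimestampFor` and the fixed `now` value are illustrative only and are not part of bull's API.

```javascript
// Hypothetical helper: assumes delayed timestamp = creation time + delay.
function delayedTimestampFor(now, delay) {
  return now + delay;
}

// Fixed example time in milliseconds, so the arithmetic is reproducible.
const now = 1587536426276;

// delay = -now lands exactly on zero.
const atBoundary = delayedTimestampFor(now, -now);

// delay = -now - 1 goes one millisecond below zero.
const pastBoundary = delayedTimestampFor(now, -now - 1);

console.log(atBoundary, pastBoundary);
```

Under that assumption, the first call produces a timestamp of zero and the second a negative one, which matches the boundary observed in the test case.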

Bull version

3.13.0

Additional information

I have created a pull request which fixes this issue: https://github.com/OptimalBits/bull/pull/1706

Explanation (as best as I could work out):

I noticed that my application using bull had maxed out CPU usage on both a Node worker and Redis. I used redis-cli monitor to investigate:

1587536426.277449 [0 172.17.0.1:47268] "evalsha" "8b912cdc5b4c20108ef73d952464fba3a7470d7b" "6" "bull:test_queue:delayed" "bull:test_queue:active" "bull:test_queue:wait" "bull:test_queue:priority" "bull:test_queue:paused" "bull:test_queue:meta-paused" "bull:test_queue:" "1587536426276" "e94d3501-42b5-459d-8f15-854d6900ef18"
1587536426.277481 [0 lua] "ZRANGEBYSCORE" "bull:test_queue:delayed" "0" "6502549202026496" "LIMIT" "0" "1000"
1587536426.277491 [0 lua] "ZRANGE" "bull:test_queue:delayed" "0" "0" "WITHSCORES"
1587536426.277500 [0 lua] "PUBLISH" "bull:test_queue:delayed" "-3.1539841246358106e+17"
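
The numbers in this trace can be checked with a little arithmetic. The sketch below uses only values copied from the log lines above. Reading the range upper bound as the timestamp argument scaled by 4096 is an observation from those numbers, not something confirmed against bull's source.

```javascript
// Values copied from the redis-cli monitor output above.
const timestampArg = 1587536426276;        // second-to-last argument of the evalsha call
const rangeMax = 6502549202026496;         // upper bound passed to ZRANGEBYSCORE
const published = -3.1539841246358106e+17; // value sent with PUBLISH

// Check whether the upper bound equals the timestamp multiplied by 4096 (0x1000).
const scaleMatches = rangeMax === timestampArg * 0x1000;

// The range query starts at 0, so a negative value cannot fall inside it.
const insideRange = published >= 0 && published <= rangeMax;

console.log({ scaleMatches, insideRange });
```

If the published value reflects the first entry in the delayed set, that entry sits below the queried range, which would be consistent with the same three commands repeating.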

These lines repeated indefinitely. I noticed the negative numbers and tried clamping my application's job delays to 0, and the issue went away. Looking into bull's codebase, I found this line which led me to the core issue. delayedTimestamp is expected to be a positive value (a time in the future). However, if the job's delay is negative enough (less than the negative of the current Unix time in milliseconds), delayedTimestamp becomes negative, the delay computed from it is <= 0, and the function repeats, causing this infinite loop.
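
The clamping workaround mentioned above could look roughly like this. `safeDelay` is a hypothetical helper in application code, not part of bull. It only guards the value before it is passed as the `delay` option.

```javascript
// Hypothetical application-side guard: never hand bull a negative delay.
function safeDelay(delay) {
  // Treat non-numeric or non-finite input as "no delay".
  if (typeof delay !== "number" || !Number.isFinite(delay)) {
    return 0;
  }
  return Math.max(0, delay);
}

// Example: a target time far in the past yields a delay of 0.
const targetTime = 0; // Unix epoch
const delay = safeDelay(targetTime - Date.now() - 1);
console.log(delay);
```

With the queue from the test case, the call would then become something like testQueue.add({}, { delay: safeDelay(requestedDelay) }), where requestedDelay stands for whatever value the application computed.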
