[Bug] Failed PR metadata jobs can block future metadata refreshes #75

@volcano303

Description

PullRequestHandler enqueues PR metadata refreshes with one reusable BullMQ job id per PR:

const jobId = `meta-${repoFullName}-${prNumber}`;
...
removeOnComplete: true,
removeOnFail: 50,
attempts: 3,

BullMQ treats custom job ids as unique while an old job with that id still exists. Its documented behavior is that adding a job with an existing id is ignored until the old job is removed.

Because failed jobs are retained by removeOnFail: 50, a fetch-pr-metadata job that exhausts retries can remain in the failed set and block later metadata refreshes for the same PR. Later pull_request.edited, pull_request.closed, pull_request.reopened, or pull_request.synchronize webhooks call fetchQueue.add() with the same meta-<repo>-<pr> id, but BullMQ can ignore the enqueue because the failed job is still retained.

This leaves PR metadata stale even after GitHub sends later webhook events.
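The blocking behavior described above can be illustrated with a small in-memory model. This is only a toy sketch, not BullMQ itself: `ToyQueue`, its methods, and the example job ids are hypothetical names, and the eviction logic is a simplification of how `removeOnFail: 50` retains failed jobs.

```typescript
// Toy in-memory model of the behavior described in this issue.
// NOT BullMQ: it only mimics "an add with an existing job id is ignored"
// plus a keep-the-last-N-failed retention policy.
type JobState = "waiting" | "failed";

class ToyQueue {
  private jobs = new Map<string, JobState>();
  private failedOrder: string[] = []; // oldest failed job first
  private keepFailed: number;

  constructor(keepFailed: number) {
    this.keepFailed = keepFailed;
  }

  // Returns true if a new job was created, false if the add was ignored
  // because a job with the same id still exists.
  add(jobId: string): boolean {
    if (this.jobs.has(jobId)) return false;
    this.jobs.set(jobId, "waiting");
    return true;
  }

  // Models removeOnComplete: true (the id becomes reusable immediately).
  complete(jobId: string): void {
    this.jobs.delete(jobId);
  }

  // Models a job that exhausted its attempts under removeOnFail: keepFailed.
  fail(jobId: string): void {
    if (this.jobs.get(jobId) !== "waiting") return;
    this.jobs.set(jobId, "failed");
    this.failedOrder.push(jobId);
    // Evict the oldest failed jobs once more than keepFailed are retained.
    while (this.failedOrder.length > this.keepFailed) {
      const evicted = this.failedOrder.shift()!;
      this.jobs.delete(evicted);
    }
  }
}

const toy = new ToyQueue(50);
const prJobId = "meta-example-org/example-repo-42"; // hypothetical PR job id

const firstAdd = toy.add(prJobId); // enqueued
toy.fail(prJobId); // all retries exhausted, job retained as failed
const blockedAdd = toy.add(prJobId); // ignored: the failed job still owns the id

// Only after 50 newer failures push the old job out does the id free up.
for (let i = 0; i < 50; i++) {
  const otherId = `meta-example-org/example-repo-${1000 + i}`;
  toy.add(otherId);
  toy.fail(otherId);
}
const afterEviction = toy.add(prJobId); // enqueued again
```

In this model `firstAdd` is `true`, `blockedAdd` is `false`, and `afterEviction` is `true`, which matches the sequence the issue describes: the refresh is blocked until the retained failed job is removed or evicted.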

Steps to Reproduce

  1. Trigger a pull_request event that enqueues fetch-pr-metadata for a tracked PR.
  2. Make that metadata job fail all retry attempts, for example through a transient GitHub GraphQL/API failure.
  3. Observe that the failed job remains retained because the queue options use removeOnFail: 50.
  4. Deliver a later pull_request.edited, pull_request.closed, pull_request.reopened, or pull_request.synchronize webhook for the same PR.
  5. Observe that the handler tries to enqueue the same job id, meta-<repoFullName>-<prNumber>.
  6. The fresh metadata refresh can be skipped because the retained failed job still owns that custom job id.

Expected Behavior

Later PR webhook events should be able to enqueue a fresh metadata refresh after a previous metadata job failed.

The mirror should eventually update:

  • pull_requests.body
  • pull_requests.last_edited_at
  • pull_requests.closing_issue_numbers
  • issues.solved_by_pr for merged PR closing references
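One possible way to get the expected behavior, sketched under assumptions rather than taken from the repository: before enqueuing, look up any existing job with the same id and remove it if it is in the failed state. `enqueueMetadataRefresh`, `QueueLike`, `JobLike`, and `MetadataJobData` are hypothetical names. The interfaces are declared locally so the sketch compiles on its own; they are meant to mirror the `getJob` and `add` methods of a BullMQ `Queue` and the `isFailed` and `remove` methods of a BullMQ `Job`, and whether the real queue in the handler fits them is an assumption.

```typescript
// Minimal structural subset of the queue/job surface this sketch relies on.
// Assumption: the real BullMQ Queue and Job objects satisfy these shapes.
interface JobLike {
  isFailed(): Promise<boolean>;
  remove(): Promise<void>;
}

interface QueueLike<T> {
  getJob(jobId: string): Promise<JobLike | undefined>;
  add(name: string, data: T, opts: { jobId: string }): Promise<unknown>;
}

// Hypothetical payload shape; the real job data may differ.
interface MetadataJobData {
  repoFullName: string;
  prNumber: number;
}

// Hypothetical helper: clears a retained failed job so the reusable id
// no longer blocks a fresh metadata refresh, then enqueues as before.
async function enqueueMetadataRefresh(
  queue: QueueLike<MetadataJobData>,
  data: MetadataJobData,
): Promise<void> {
  const jobId = `meta-${data.repoFullName}-${data.prNumber}`;

  const existing = await queue.getJob(jobId);
  if (existing && (await existing.isFailed())) {
    // Only failed jobs are removed; waiting or active jobs keep deduplicating.
    await existing.remove();
  }

  await queue.add("fetch-pr-metadata", data, { jobId });
}
```

This keeps the deduplication that the reusable job id provides for jobs still waiting or running, and only frees the id when the previous job has already failed. The lookup and the add are not atomic, so two webhooks arriving at the same moment could still race; other options, such as not retaining failed jobs at all, avoid that at the cost of losing failed-job history for debugging.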

Actual Behavior

A retained failed metadata job can cause later refresh attempts for the same PR to be ignored by BullMQ. The mirror can keep stale or missing PR metadata until the failed job is manually removed or evicted by retention.

Environment

  • OS: Any
  • Runtime/Node version: Node 20
  • Browser (if applicable): N/A

Additional Context

Affected code:

  • packages/das/src/webhook/handlers/pull-request.handler.ts
  • packages/das/src/queue/fetch.processor.ts

This is distinct from existing issues: it is specifically about retained failed jobs with reusable PR metadata job ids blocking future refreshes.
