stamtos commented Jan 26, 2026

Summary

This PR fixes a bug where job batches could remain in the "In Progress" state indefinitely, even after all associated jobs had finished.

Problem

Two main issues were identified in the check_state trigger logic:

  1. Race Condition via Deduplication: Passing identity_key=identity_exact to with_delay() caused the check_state job enqueued by the last finishing batch job to be deduplicated (skipped) whenever an earlier check_state job for the same batch was still pending or enqueued. As a result, the final state transition to "Finished" might never happen.
  2. Missing Terminal States: The trigger only fired when a job reached the done state, so jobs that ended as failed or cancelled never triggered a state check.
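The first issue can be illustrated with a toy model. This is a minimal sketch in plain Python, not the queue_job implementation: `simulate`, `request_check`, and the dictionaries are hypothetical names, and the interleaving shown (an earlier check evaluates the batch before the last job's completion is visible, while still counting as pending for deduplication) is an assumption about how the skipped check leaves the batch stuck.

```python
TERMINAL_STATES = {"done", "cancelled", "failed"}


def simulate(dedupe):
    """Toy interleaving of two batch jobs and their check_state requests.

    Assumption: a check that has read the job states but not yet finished
    is still seen as pending by other transactions.
    """
    job_states = {"a": "started", "b": "started"}
    batch = {"state": "in_progress"}
    waiting_checks = []  # checks that still look pending/enqueued to others

    def request_check():
        # With deduplication, a new check is skipped while one is waiting.
        if dedupe and waiting_checks:
            return None
        check = {"seen": None}
        waiting_checks.append(check)
        return check

    def read(check):
        # The check takes a snapshot of the job states it can see.
        check["seen"] = dict(job_states)

    def finish(check):
        waiting_checks.remove(check)
        if all(state in TERMINAL_STATES for state in check["seen"].values()):
            batch["state"] = "finished"

    job_states["a"] = "done"
    first = request_check()
    read(first)                 # sees job "b" still running
    job_states["b"] = "done"    # last job of the batch completes
    second = request_check()    # skipped when dedupe is on
    finish(first)
    if second is not None:
        read(second)
        finish(second)
    return batch["state"]


print(simulate(dedupe=True))
print(simulate(dedupe=False))
```

In this model, the deduplicated run leaves the batch in `in_progress`, while the run without deduplication reaches `finished` because the last job always gets its own check.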

Solution

  • Removed identity_key=identity_exact from the check_state delay call to ensure every job completion attempts a state check, preventing the race condition.
  • Updated the trigger condition to include all terminal states: done, cancelled, and failed.
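The updated trigger condition could look roughly like the following. This is a hedged sketch using plain dictionaries instead of Odoo records: `batches_to_check` and the `batch_id`/`state` keys are hypothetical, and the exact comparison used in the PR may differ.

```python
TERMINAL_STATES = frozenset({"done", "cancelled", "failed"})


def batches_to_check(jobs, new_state):
    """Return ids of batches whose state should be re-checked.

    A batch qualifies when one of its jobs moves from a non-terminal
    state into any terminal state (assumed reading of the new trigger).
    """
    if new_state not in TERMINAL_STATES:
        return set()
    return {
        job["batch_id"]
        for job in jobs
        if job.get("batch_id") and job["state"] not in TERMINAL_STATES
    }


jobs = [
    {"batch_id": 1, "state": "started"},
    {"batch_id": None, "state": "started"},  # job outside any batch
    {"batch_id": 2, "state": "done"},        # already terminal
]
print(batches_to_check(jobs, "failed"))
print(batches_to_check(jobs, "started"))
```

Each returned batch would then get its own delayed check_state call without an identity key, so no completion is skipped.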

Fixes #810

- Remove identity_exact from check_state delay to prevent race conditions
- Trigger check_state for all terminal states (done, cancelled, failed)

Fixes OCA#810

Add tests to verify:
- Failed jobs trigger check_state on the batch
- Cancelled jobs trigger check_state on the batch
- No deduplication occurs when multiple jobs complete (race condition fix)
@stamtos stamtos force-pushed the 18.0-fix-batch-stuck-progress branch from 5bcaf4f to 3783e9a Compare January 26, 2026 06:48
Development

Successfully merging this pull request may close these issues.

[18.0] Job batches stuck in In Progress