Conversation
There was a problem hiding this comment.
Pull request overview
Implements retry logic for sending flow definitions so Draco can resend definitions after reconnecting when Sagittarius/Aquila is temporarily unavailable (Issue #189).
Changes:
- Replace one-shot definitions send with a retry loop that waits between attempts.
- Update
code0-flowdependency from0.0.30to0.0.31(and lockfile updates).
Reviewed changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| crates/base/src/runner.rs | Adds retry loop around definitions update during startup. |
| Cargo.toml | Bumps workspace code0-flow version. |
| Cargo.lock | Updates resolved dependency versions/checksums accordingly. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| let mut success = false; | ||
| let mut count = 1; | ||
| while !success { | ||
| success = definition_service.send_with_status().await; | ||
| if success { | ||
| break; | ||
| } | ||
|
|
||
| log::warn!( | ||
| "Updating definitions failed, trying again in 2 secs (retry number {})", | ||
| count | ||
| ); | ||
| count += 1; | ||
| sleep(Duration::from_secs(3)).await; | ||
| } |
There was a problem hiding this comment.
The retry loop for sending definitions runs before the server init/run and before the shutdown signal tokio::select! blocks, so if Aquarius/Sagittarius stays unavailable this can block startup indefinitely and prevent Ctrl+C/SIGTERM from being handled. Consider adding cancellation (e.g., tokio::select! on ctrl_c/sigterm) and/or moving definition retries into a background task with a bounded retry/backoff policy so the process can still start/shutdown cleanly.
| log::warn!( | ||
| "Updating definitions failed, trying again in 2 secs (retry number {})", | ||
| count | ||
| ); | ||
| count += 1; | ||
| sleep(Duration::from_secs(3)).await; | ||
| } |
There was a problem hiding this comment.
The log message says the next retry is in 2 seconds, but the code sleeps for 3 seconds. Please make the message and Duration::from_secs(...) consistent so operators can trust the logs.
| let mut success = false; | ||
| let mut count = 1; | ||
| while !success { | ||
| success = definition_service.send_with_status().await; | ||
| if success { |
There was a problem hiding this comment.
while !success { ... if success { break; } } is redundant and makes the loop harder to read. Consider using a loop {} with a single break when the send succeeds (or returning early), which also avoids tracking success separately.
| let mut success = false; | |
| let mut count = 1; | |
| while !success { | |
| success = definition_service.send_with_status().await; | |
| if success { | |
| let mut count = 1; | |
| loop { | |
| if definition_service.send_with_status().await { |
Resolves: #189