feat(daemon): implement UpgradeMonitor goroutine for NetworkUpgradeExecute CRs (#519) #609
Merged
leninmehedy merged 11 commits into 00499-feat-solo-provisioner-daemon-core on May 28, 2026
Contributor
Pull request overview
Implements the previously stubbed UpgradeMonitor for the solo-provisioner-daemon, adding a Kubernetes watch loop for NetworkUpgradeExecute CRs and wiring it into daemon startup with a fail-fast daemon.yaml configuration load.
Changes:
- Implement `UpgradeMonitor` watch/reconnect loop with exponential backoff and auth-error kubeconfig refresh.
- Add `daemon.yaml` parsing/validation at daemon startup and update daemon construction to be error-returning.
- Add unit tests (fake dynamic client) and test-only exports/helpers.
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| pkg/models/weaver_paths.go | Adds DaemonConfigPath so daemon startup can locate daemon.yaml via standard paths. |
| internal/daemon/export_test.go | Adds a test-only constructor to inject daemon sub-systems. |
| internal/daemon/errors.go | Introduces typed daemon config error (ErrConfig). |
| internal/daemon/daemon.go | Changes New to load daemon.yaml, build UpgradeMonitor, and return (*Daemon, error). |
| internal/daemon/config.go | Implements DaemonConfig + LoadDaemonConfig with required-field validation. |
| internal/daemon/consensus/errors.go | Adds consensus-scoped typed errors for K8s client/watch failures. |
| internal/daemon/consensus/upgrade_monitor.go | Full UpgradeMonitor implementation (watch loop, backoff, auth rebuild, dedup, panic recovery). |
| internal/daemon/consensus/export_test.go | Exposes isAuthError to white-box tests (test-only). |
| internal/daemon/consensus/upgrade_monitor_test.go | Adds unit tests using dynamic/fake for basic watch-loop scenarios and auth-error detection. |
| cmd/daemon/main.go | Updates daemon bootstrap to handle daemon.New returning an error. |
| docs/claude/reviews/00519-implement-upgrade-monitor-goroutine.md | Adds UAT/review guide for the new monitor + config behavior. |
| docs/claude/plans/00519-implement-upgrade-monitor-goroutine.md | Adds design/plan document for the UpgradeMonitor story. |
Contributor
There's just a minor DCO issue that needs to be fixed.
…ecute CRs (#519)

Watches NetworkUpgradeExecute CRs for ReadyForProvisionerDaemon phase transitions and triggers the execute-phase workflow (stub; full logic in subsequent stories).

Self-healing: all watch errors retry with exponential backoff (2 s → 5 min); auth errors additionally rebuild the dynamic client from kubeconfig on disk so the daemon recovers after RBAC is applied without a manual restart.

- consensus/upgrade_monitor.go: UpgradeMonitor with Run/runWatch/handleEvent/handleExecute (stub); buildDynamicClient; isAuthError; operationId dedup
- daemon/config.go: LoadDaemonConfig reads daemon.yaml (kubeconfig + orbit); fails fast if missing or fields empty
- daemon/daemon.go: New() reads daemon.yaml and constructs UpgradeMonitor; Run() adds it to errgroup
- daemon/export_test.go: NewWithComponents test helper (test-only, not in production API)
- pkg/models/weaver_paths.go: DaemonConfigPath = $home/config/daemon.yaml
- upgrade_monitor_test.go: 4 unit tests via fake dynamic client

Signed-off-by: Lenin Mehedy <lenin.mehedy@hashgraph.com>
…pgrade/migrate workflows Signed-off-by: Lenin Mehedy <lenin.mehedy@hashgraph.com>
…olicy for daemon and UC Signed-off-by: Lenin Mehedy <lenin.mehedy@hashgraph.com>
…tured journald logs Signed-off-by: Lenin Mehedy <lenin.mehedy@hashgraph.com>
…ovisioner daemon check consumer Signed-off-by: Lenin Mehedy <lenin.mehedy@hashgraph.com>
Signed-off-by: Lenin Mehedy <lenin.mehedy@hashgraph.com>
…leExecute stub Signed-off-by: Lenin Mehedy <lenin.mehedy@hashgraph.com>
…guide to daemon.yaml approach

- Add NodeID (required) and UpgradeDir (optional, default /opt/solo/weaver/upgrade) fields to DaemonConfig and LoadDaemonConfig validation
- Add DaemonConfig.upgradeDir() helper with default fallback
- Update daemon.yaml example in plan + review docs with all four fields
- Replace CLI flags table in implementation-guide.md with daemon.yaml field table; drop --poll-interval (UpgradeMonitor is watch-based, not poll-based); clean up ExecStart example to show no flags

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Lenin Mehedy <lenin.mehedy@hashgraph.com>
…ns table Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> Signed-off-by: Lenin Mehedy <lenin.mehedy@hashgraph.com>
…fields

Add --node-id, --kubeconfig, --orbit, --upgrade-dir flags to solo-provisioner-daemon. Each flag is optional and overrides the corresponding daemon.yaml field when set. The production service file remains flag-free; flags are for operator debugging and CI integration testing without requiring a daemon.yaml file on disk.

Implementation:
- Extract DaemonConfig.Validate() from LoadDaemonConfig so validation can be called after overrides are applied
- Add NewFromConfig(paths, cfg) constructor for callers that hold a pre-resolved config; New() wraps it for the file-only production path
- cmd/daemon/main.go: load daemon.yaml, apply flag overrides, re-validate, call NewFromConfig

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Lenin Mehedy <lenin.mehedy@hashgraph.com>
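The "override only when set" behavior described above can be sketched with the standard `flag` package, whose `FlagSet.Visit` walks only the flags that were actually passed. This is an illustration, not the PR's `main.go`: the PR may use a different flag library, and `applyFlagOverrides` plus the struct field names are hypothetical. Only the four flag names come from the commit message.

```go
package main

import (
	"flag"
	"fmt"
)

// DaemonConfig is a trimmed stand-in for the PR's config type.
type DaemonConfig struct {
	NodeID, Kubeconfig, Orbit, UpgradeDir string
}

// applyFlagOverrides (hypothetical) parses args and overwrites only the
// fields whose flags were explicitly passed; unset flags leave cfg as loaded.
func applyFlagOverrides(cfg DaemonConfig, args []string) (DaemonConfig, error) {
	fs := flag.NewFlagSet("solo-provisioner-daemon", flag.ContinueOnError)
	nodeID := fs.String("node-id", "", "override node_id")
	kubeconfig := fs.String("kubeconfig", "", "override kubeconfig")
	orbit := fs.String("orbit", "", "override orbit")
	upgradeDir := fs.String("upgrade-dir", "", "override upgrade_dir")
	if err := fs.Parse(args); err != nil {
		return cfg, err
	}
	// Visit calls the function only for flags that were set on the command line.
	fs.Visit(func(f *flag.Flag) {
		switch f.Name {
		case "node-id":
			cfg.NodeID = *nodeID
		case "kubeconfig":
			cfg.Kubeconfig = *kubeconfig
		case "orbit":
			cfg.Orbit = *orbit
		case "upgrade-dir":
			cfg.UpgradeDir = *upgradeDir
		}
	})
	return cfg, nil
}

func main() {
	fromFile := DaemonConfig{NodeID: "node1", Kubeconfig: "/etc/kube/config", Orbit: "orbit-a"}
	cfg, err := applyFlagOverrides(fromFile, []string{"--node-id", "node7"})
	fmt.Printf("%+v %v\n", cfg, err)
}
```

Because an explicitly passed empty value also counts as set, re-validating after the overrides, as the commit describes, is what catches a flag that blanks out a required field.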
Signed-off-by: Lenin Mehedy <lenin.mehedy@hashgraph.com>
Merged commit db9fe60 into 00499-feat-solo-provisioner-daemon-core
16 checks passed
Summary
- `daemon.yaml` (written by `provisioner daemon install`): service file has no flags and stays identical on every node; adds `node_id` (required, used as `nodeId` in JSONL events) and `upgrade_dir` (optional, defaults to `/opt/hgcapp/services-hedera/HapiApp2.0/data/upgrade/current`)
- Flags (`--node-id`, `--kubeconfig`, `--orbit`, `--upgrade-dir`) override the corresponding `daemon.yaml` fields when set: useful for operator debugging and CI testing without a config file on disk; production deployments set no flags
- `handleExecute` is a stub: full workflow and JSONL audit trail land in subsequent stories; design is captured in `docs/claude/plans/eventlog-jsonl-upgrade-event-logger.md`

Test plan
- `task vm:test:unit`: 6 new unit tests covering trigger, dedup, busy-rejection, phase filtering, and auth-error detection
- `task test:coverage TEST_PATHS=./internal/daemon/... TEST_REGEX="."`
- `docs/claude/reviews/00519-implement-upgrade-monitor-goroutine.md` for full 9-step walkthrough

🤖 Generated with Claude Code
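The test plan names four event-handling behaviors: trigger, dedup, busy-rejection, and phase filtering. A minimal sketch of how such a gate might be structured follows. It is not the PR's implementation: the `gate` type, its method names, the reason strings, and the order of the checks are all hypothetical; only the `ReadyForProvisionerDaemon` phase name and the idea of deduplicating on an operation ID come from the PR text.

```go
package main

import (
	"fmt"
	"sync"
)

// readyPhase is the phase the monitor reacts to, per the PR description.
const readyPhase = "ReadyForProvisionerDaemon"

// gate is a hypothetical guard deciding whether an event starts a workflow.
type gate struct {
	mu   sync.Mutex
	busy bool
	seen map[string]struct{} // operation IDs already accepted (unbounded here)
}

func newGate() *gate { return &gate{seen: make(map[string]struct{})} }

// tryStart reports whether the event should trigger the workflow and, if
// not, why. The order of checks is an assumption.
func (g *gate) tryStart(phase, operationID string) (bool, string) {
	g.mu.Lock()
	defer g.mu.Unlock()
	if phase != readyPhase {
		return false, "phase ignored"
	}
	if _, dup := g.seen[operationID]; dup {
		return false, "duplicate operation"
	}
	if g.busy {
		return false, "busy"
	}
	g.busy = true
	g.seen[operationID] = struct{}{}
	return true, ""
}

// finish marks the in-flight workflow as done so the next one can start.
func (g *gate) finish() {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.busy = false
}

func main() {
	g := newGate()
	events := [][2]string{
		{"Pending", "op-1"},  // wrong phase
		{readyPhase, "op-1"}, // accepted
		{readyPhase, "op-2"}, // rejected while op-1 is running
	}
	for _, e := range events {
		ok, reason := g.tryStart(e[0], e[1])
		fmt.Println(e[1], ok, reason)
	}
}
```

In this sketch an operation rejected as busy is not recorded as seen, so a later re-delivery of the same event can still trigger it once the running workflow finishes.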