feat(sdk): gate supernode candidates by minimum version (>= 2.5.0) #293
Merged
Add an SDK-side compatibility gate that refuses to upload to supernodes
running a version older than pkg/version.MinSupernodeVersion (2.5.0).
WHY
---
The chain upgrades atomically at the halt height; the supernode fleet
upgrades asynchronously over hours-to-days. During a rollout window an
SDK client that already speaks the post-upgrade contract (e.g. LEP-5
AvailabilityCommitment at register, requiring ChunkProof[] at finalize
on v1.12.0) can be paired with a supernode that lacks the matching code
path. The action then stalls in PROCESSING until expiry: per-action
data loss, no chain-wide impact, but a poor UX during rollouts.
The SDK is the single layer that can read both sides (chain params and
the candidate supernode's reported version) and refuse the pairing.
Mainnet operators choose their own upgrade timing, so a client-side
gate is the only available lever.
WHAT
----
* pkg/version/min_supernode.go — single source of truth constant
MinSupernodeVersion = "2.5.0". Pre-release suffixes on the floor
base (e.g. "2.5.0-rc1") are treated as eligible by design.
* pkg/version/compatibility.go — IsCompatibleSupernodeVersion(string)
using Masterminds/semver/v3. Compares against floor "2.5.0-0" so any
2.5.0 pre-release is accepted. Fail-closed on empty / unparseable
input: a supernode that cannot prove its version is presumed stale.
* sdk/task/task.go — filterEligibleSupernodesParallel now consults the
StatusResponse.version returned by the existing per-candidate
GetSupernodeStatus probe (no additional RPC). Failing nodes are
rejected with a clear reason that includes both reported and minimum
versions.
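The comparator contract described above can be sketched with the standard library alone. This is not the PR's code: the PR uses Masterminds/semver/v3, while this sketch hand-rolls a simplified subset of semver precedence. The names isCompatibleSketch, parse, and comparePre are hypothetical, and the sketch skips some validation (for example leading zeros and empty pre-release identifiers).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// minSupernodeVersion mirrors the floor named in the text.
const minSupernodeVersion = "2.5.0"

// parsed holds the numeric core and the pre-release identifiers of a version.
type parsed struct {
	core [3]int
	pre  []string // empty for a plain release
}

// parse accepts forms such as "v2.5.0", "2.5" and "2.5.0-rc1+build.5".
// Anything else (empty, whitespace, "dev") yields ok=false.
func parse(s string) (parsed, bool) {
	var p parsed
	s = strings.TrimPrefix(s, "v")
	if i := strings.IndexByte(s, '+'); i >= 0 {
		s = s[:i] // build metadata never affects precedence
	}
	core, pre, hasPre := strings.Cut(s, "-")
	if hasPre {
		if pre == "" {
			return p, false
		}
		p.pre = strings.Split(pre, ".")
	}
	parts := strings.Split(core, ".")
	if len(parts) > 3 {
		return p, false
	}
	for i, part := range parts {
		n, err := strconv.Atoi(part)
		if err != nil || n < 0 {
			return p, false
		}
		p.core[i] = n // missing minor/patch stay 0, so "2.5" reads as 2.5.0
	}
	return p, true
}

// comparePre orders pre-release identifier lists; only the sign of the
// result is meaningful. A release (no identifiers) outranks any pre-release.
func comparePre(a, b []string) int {
	if len(a) == 0 || len(b) == 0 {
		return len(b) - len(a)
	}
	for i := 0; i < len(a) && i < len(b); i++ {
		an, aErr := strconv.Atoi(a[i])
		bn, bErr := strconv.Atoi(b[i])
		switch {
		case aErr == nil && bErr == nil:
			if an != bn {
				return an - bn
			}
		case aErr == nil:
			return -1 // numeric identifiers sort below alphanumeric ones
		case bErr == nil:
			return 1
		default:
			if c := strings.Compare(a[i], b[i]); c != 0 {
				return c
			}
		}
	}
	return len(a) - len(b)
}

// isCompatibleSketch reports whether reported >= "<floor>-0", failing
// closed on input that does not parse.
func isCompatibleSketch(reported string) bool {
	v, ok := parse(reported)
	floor, floorOK := parse(minSupernodeVersion + "-0")
	if !ok || !floorOK {
		return false
	}
	for i := range v.core {
		if v.core[i] != floor.core[i] {
			return v.core[i] > floor.core[i]
		}
	}
	return comparePre(v.pre, floor.pre) >= 0
}

func main() {
	for _, s := range []string{"2.5.0", "v2.5.0-rc1", "2.4.72", "dev", ""} {
		fmt.Printf("%q -> %v\n", s, isCompatibleSketch(s))
	}
}
```

Comparing against the "-0" pre-release rather than the bare release is what lets every 2.5.0 pre-release through: "0" is the lowest numeric identifier, and numeric identifiers sort below alphanumeric ones such as rc1.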
INVARIANTS
----------
| # | Invariant | Enforcement |
|---|--------------------------------------------------------|----------------------|
| I1| No SN < MinSupernodeVersion reaches upload | filter (sole entry) |
| I2| Missing / unparseable version is rejected (fail-closed)| same point |
| I3| Single source of truth for the floor | pkg/version constant |
| I4| 2.5.0-rc* / -beta / +build accepted | semver -0 floor |
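Invariants I1 and I2 amount to a single filtering point that probes each candidate and drops any node whose version fails the check. A minimal sketch of that shape follows, assuming hypothetical candidate and status types and an injected compatibility predicate; it is not the actual filterEligibleSupernodesParallel implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// candidate and status are hypothetical stand-ins for the SDK's types.
type candidate struct{ Address string }
type status struct{ Version string }

// filterEligible probes every candidate concurrently and keeps those whose
// reported version passes isCompatible. A probe error or a failing version
// drops the node and records a reason keyed by address.
func filterEligible(
	cands []candidate,
	probe func(candidate) (status, error),
	isCompatible func(string) bool,
	minVersion string,
) (eligible []candidate, rejected map[string]string) {
	reasons := make([]string, len(cands)) // one slot per goroutine, so no lock is needed
	var wg sync.WaitGroup
	for i, c := range cands {
		wg.Add(1)
		go func(i int, c candidate) {
			defer wg.Done()
			st, err := probe(c)
			switch {
			case err != nil:
				reasons[i] = fmt.Sprintf("status probe failed: %v", err)
			case !isCompatible(st.Version):
				reported := st.Version
				if reported == "" {
					reported = "<unreported>"
				}
				reasons[i] = fmt.Sprintf("supernode version %s is below SDK minimum %s", reported, minVersion)
			}
		}(i, c)
	}
	wg.Wait()

	rejected = make(map[string]string)
	for i, c := range cands {
		if reasons[i] == "" {
			eligible = append(eligible, c)
		} else {
			rejected[c.Address] = reasons[i]
		}
	}
	return eligible, rejected
}

func main() {
	versions := map[string]string{"sn1": "2.5.0-rc1", "sn2": "2.4.72", "sn3": ""}
	probe := func(c candidate) (status, error) {
		return status{Version: versions[c.Address]}, nil
	}
	// Toy predicate for the demo only; the real check is a semver comparison.
	accepted := map[string]bool{"2.5.0-rc1": true}
	ok := func(v string) bool { return accepted[v] }

	eligible, rejected := filterEligible(
		[]candidate{{"sn1"}, {"sn2"}, {"sn3"}}, probe, ok, "2.5.0")
	fmt.Println("eligible:", eligible)
	for addr, why := range rejected {
		fmt.Printf("%s rejected: %s\n", addr, why)
	}
}
```

Because the version is read from the status probe the filter already performs, the check adds no extra round trip per candidate.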
TESTS
-----
* pkg/version/compatibility_test.go covers 25 rows including the floor,
patch/minor/major above, all rc/beta/alpha forms of 2.5.0, the -0
boundary, pre-floor rcs (must reject), empty / whitespace / garbage,
semver-padded inputs ("2.5", "2"), and a self-compatibility guard
that protects against an unparseable MinSupernodeVersion constant.
NOT IN THIS PR
--------------
* Chain-side enforcement (out of scope; SDK is the available lever).
* sdk-go / sdk-js / sdk-rs — sdk-go inherits this automatically on
next dep bump; the other SDKs need their own gates (follow-ups).
* End-to-end live devnet test: deferred until a v2.5.0-rc1 supernode
binary exists. The 25-row unit-test matrix proves the comparator
contract today.
The cascade-e2e CI job was failing on this branch with
"no eligible supernodes to register" (SDK event:
sdk:supernodes_found count=0 total=3).
Root cause: tests/scripts/setup-supernodes.sh built the test binary
with -ldflags="-s -w" only, leaving supernode/cmd.Version at its
default "dev". The supernode metrics collector parses "dev" via
supernode/supernode_metrics/metrics_collection.go::parseVersion,
which falls back to [2,0,0] for unparseable inputs. StatusResponse
then reports version="dev"; the new SDK gate
(pkg/version.IsCompatibleSupernodeVersion) rejects it because "dev" is
not parseable semver and the gate fails closed (the [2,0,0] fallback
would also sit below MinSupernodeVersion=2.5.0), so
filterEligibleSupernodesParallel drops every candidate.
Fix: inject SUPERNODE_TEST_VERSION (default "2.5.0-test") via -X
into cmd.Version, exactly mirroring the production Makefile's
LDFLAGS pattern. The "-test" suffix is a valid semver prerelease
on the 2.5.0 base, so the gate accepts it.
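The link-time injection described in the fix can be illustrated with a tiny standalone program. This is a sketch under assumptions: the variable lives in package main here, whereas the text targets the supernode's cmd.Version, whose full import path is elided in the source. The helper versionLine is hypothetical.

```go
package main

import "fmt"

// Version defaults to "dev" and is meant to be overwritten at link time.
// Hypothetical build command for this sketch:
//
//	go build -ldflags "-X main.Version=2.5.0-test" -o demo .
//
// The -X flag only works on package-level string variables, not constants.
var Version = "dev"

// versionLine formats the value the way the local validation output reads.
func versionLine() string {
	return fmt.Sprintf("Version: %s", Version)
}

func main() {
	fmt.Println(versionLine())
}
```

Built without the flag, the program reports the "dev" default, which is the state the test script was leaving the supernode binary in.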
Sibling audit: only one go-build call site exists in
tests/scripts/setup-supernodes.sh (setup_primary); setup_secondary
copies the primary binary. The tests/system/e2e_sn_manager_test.go
buildSN call is for the sn-manager-e2e-tests job, which is currently
commented out in .github/workflows/tests.yml and does not exercise
the cascade SDK gate. Out of scope for this fix.
Local validation:
go build -ldflags="-X .../cmd.Version=2.5.0-test" ...
./supernode version -> "Version: 2.5.0-test"
IsCompatibleSupernodeVersion("2.5.0-test") -> true
j-rafique approved these changes on May 12, 2026
Summary
Adds an SDK-side compatibility gate that refuses to upload to supernodes running a version older than pkg/version.MinSupernodeVersion (currently 2.5.0).
Closes the only real activation hazard surfaced in the v1.11.1 → v1.12.0 upgrade-survival review: an SDK client that has adopted the post-upgrade contract (e.g. LEP-5 AvailabilityCommitment at register, requiring ChunkProof[] at finalize) being paired with a pre-LEP-5 supernode during the asynchronous mainnet rollout window.
Why
Without a gate, a v1.12.0 client paired with a v2.4.x supernode submits an action that no SN can finalize. The action sits in PROCESSING until expiry → per-action data loss, no chain-wide impact, but visibly broken uploads during the rollout window.
The SDK is the only layer that can read both sides (chain params + candidate SN's reported version) and refuse the pairing. This PR is that gate.
What
* pkg/version/min_supernode.go: MinSupernodeVersion = "2.5.0", the single source of truth.
* pkg/version/compatibility.go: IsCompatibleSupernodeVersion(reported string) bool using Masterminds/semver/v3. Floor is compared as 2.5.0-0 so all 2.5.0-rc* / -beta / +meta variants are eligible. Fail-closed on empty / unparseable input.
* pkg/version/compatibility_test.go: the 25-row unit-test matrix for the comparator.
* sdk/task/task.go: filterEligibleSupernodesParallel consults StatusResponse.version from the existing per-candidate GetSupernodeStatus probe (no new RPC). Rejected nodes get a clear reason: supernode version X is below SDK minimum 2.5.0.
Invariants
* No SN < 2.5.0 reaches the upload step. Enforced in filterEligibleSupernodesParallel (sole filter entry).
* The floor lives in the pkg/version/min_supernode.go const; there is no "2.5.0" literal outside pkg/version.
* 2.5.0-rc*, -beta, and +build of the floor base are eligible, via the 2.5.0-0 floor. Covered by 2.5.0-rc1, v2.5.0-rc2, 2.5.0-rc2+build.5, -beta, -alpha.1, the -0 boundary, and negative 2.4.99-rc1.
Test output
Rollout semantics
* 2.5.0-rc1 is eligible: this is intentional and confirmed with @mateeullahmalik. The forthcoming SN release will tag as v2.5.0-rc1 first; the gate must let it through.
* 2.4.72 and earlier are rejected; that is the entire point.
* v2.5.0 is eligible; a leading v is tolerated.
What is NOT in this PR
* sdk-go / sdk-js / sdk-rs direct changes: sdk-go picks this up automatically on the next supernode dep bump (it imports pkg/version transitively via sdk/task). sdk-js and sdk-rs need their own gates (follow-up issues to file).
* End-to-end live devnet test: deferred until a v2.5.0-rc1 supernode binary is cut. Today there is no SN tag at or above the floor, so a live test would just confirm that the gate rejects everything (which the unit tests already prove deterministically). A live E2E should be run as part of the v2.5.0-rc1 release validation against a mixed-version fleet (one stale v2.4.72 + one v2.5.0-rc1). I'll run it then via the cascade-register-rs-snapi skill.
Risks & mitigations
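* Risk: a supernode stops reporting version (regression in a future SN build). Mitigation: fail-closed. The node is rejected with "version <unreported> is below SDK minimum 2.5.0". No silent passthrough.
* Risk: the floor constant is changed or duplicated incorrectly ("2.5.0" elsewhere). Mitigation: TestMinSupernodeVersion_IsValidSemver guards this; the package will not compile cleanly.
Rollback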
Revert is safe and clean: the change is additive within an already-failing branch of
filterEligibleSupernodesParallel. Reverting removes the version check; the existing peers / health / balance gates remain.