test(devnet): add LEP-6 chain-side lifecycle e2e tests #136
Pull request overview
Adds a new devnet E2E test suite to exercise the LEP-6 storage-truth lifecycle against the live 5-validator Docker devnet, plus the necessary Makefile/docs wiring and devnet genesis parameter tweaks to make the lifecycle testable within reasonable time.
Changes:
- Add a new shell-driven LEP-6 chain-side devnet test runner (`devnet/tests/lep6/lep6_test.sh`) covering params, epoch report, recheck evidence, negative authorization cases, and heal-op lifecycle.
- Wire the suite into devnet workflows via `make devnet-tests-lep6` and document it in devnet docs.
- Adjust devnet genesis audit/LEP-6 parameters (e.g., divisor, mode, heal threshold, and postpone-related knobs) for testability.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| Makefile.devnet | Adds devnet-tests-lep6 target to run the new suite. |
| devnet/tests/lep6/lep6_test.sh | New bash E2E runner implementing LEP-6 chain-side lifecycle tests. |
| devnet/default-config/devnet-genesis.json | Tweaks audit/LEP-6 params to make devnet lifecycle testing feasible. |
| docs/devnet/tests.md | Documents the new LEP-6 suite alongside existing devnet test suites. |
| docs/devnet/makefile-commands.md | Documents the new make devnet-tests-lep6 target. |
| docs/plans/LEP6_DEVNET_TEST_PLAN.md | Adds an implementation plan/runbook for the LEP-6 devnet test suite. |
Closes the missing devnet-test gap for LEP-6 storage-truth (Anton: 'LEP-6
doesn't even have devnet tests'). Chain-side lifecycle e2e under the live
5-validator Docker devnet, modeled on devnet/tests/everlight/everlight_test.sh.
Tests (4, all chain-side, no supernode runtime dependency):
T1 TestLEP6_ParamsAndEpochAnchor — params readback, divisor/mode/epoch_length
sanity, current-epoch-anchor query, active SN set check.
T2 TestLEP6_SubmitEpochReport_HappyPath — host report + peer observations +
storage proof results submitted by prober; storage-challenge-reports query
confirms indexed.
T3 TestLEP6_SubmitStorageRecheckEvidence_UpdatesSuspicionScore — INVALID_TRANSCRIPT
seed -> RECHECK_CONFIRMED_FAIL; verifies +15 NodeSuspicion delta and +8
TicketDeterioration delta (LEP-6 spec scoring constants).
T4 TestLEP6_HealOpLifecycle_ClaimVerifyFinalize — drives ticket deterioration
past heal_threshold via repeated rechecks, EndBlock creates heal op,
claim-heal-complete + submit-heal-verification x2 -> HEAL_OP_STATUS_VERIFIED.
Files:
- new: devnet/tests/lep6/lep6_test.sh (1062 lines, bash)
- new: docs/plans/LEP6_DEVNET_TEST_PLAN.md
- mod: devnet/default-config/devnet-genesis.json (5 LEP-6 param tweaks for
testability: divisor=1, mode=SOFT, postpone_threshold=100, decay=1000
(no decay), consecutive_epochs_to_postpone=100; epoch_length_blocks=20
was already present)
- mod: Makefile.devnet (devnet-tests-lep6 target + .PHONY)
- mod: devnet/Readme.md (§6.7 table — adds everlight + lep6 rows)
Self-bootstrap registration: devnet bundle ships only lumerad + libwasmvm
(no supernode binary), so supernode-setup.sh skips. Test self-registers each
validator's own key as a supernode if no SNs exist (mirrors
everlight_test.sh:ensure_supernode_registered_for_service).
Validation: 15/15 PASS, 0 FAIL, 0 SKIP against live 5-validator devnet
(2026-05-03). All LEP-6 spec scoring constants verified end-to-end under
real BFT consensus.
Out of scope (follow-up PRs):
- Full supernode<->lumera e2e (gated on supernode #286/#287/#288 merge)
- Postponement / recovery / edge cases (already covered in
tests/systemtests/audit_storage_truth_*)
- devnet-upgrade-1120 rehearsal target (separate concern)
Note for everlight_test.sh maintainer: S7.7's 'set epoch_length_blocks=20
via gov' is now redundant — the param became genesis-immutable upstream and
this PR makes the genesis ship at 20.
Run with:
make devnet-up-detach
make devnet-tests-lep6
mateeullahmalik
left a comment
Code Review — PR #136 (LEP-6 chain-side devnet e2e)
Verdict: APPROVE with non-blocking notes. This is a test-only / devnet-config PR: no chain code, no keeper, no proto, no msg-server, no migration. State machine surface area is zero, so the standard chain-guardrails risks (determinism, gas, IBC, upgrade) do not apply. The test script itself is well-engineered — bootstrap idempotency, sequence-mismatch retry, epoch-boundary handling, prober/healer/verifier lookups from on-chain state, and wait_for_tx-with-distinct-rc-for-DeliverTx-failure are all done correctly. The one thing in this PR that isn't test-only — the global devnet-genesis param tweaks — is what reviewers should look at twice.
🟡 Notes (non-blocking, worth addressing before merge or in a follow-up)
N1 — Devnet-genesis param changes are global, not LEP-6-scoped. devnet/default-config/devnet-genesis.json and devnet-genesis-evm.json are loaded by every devnet bring-up (make devnet-up, make devnet-tests-everlight, future devnet-driven CI). Two changes here meaningfully alter chain-side behavior on every devnet run, not just make devnet-tests-lep6:
- `storage_truth_challenge_target_divisor`: 3 → 1. Spec §6.1: `challenge_target_count = max(1, ceil(active_supernodes / divisor))`. On a 5-SN devnet this raises the per-epoch storage-challenge target fan-out from 2 to 5 targets per prober (a 2.5x increase) and grows the `SubmitEpochReport` payload with it every epoch, regardless of which test suite is running. Fine for testability, but please call this out in the PR description as a "global devnet default change"; it isn't only a LEP-6 knob.
- `storage_truth_ticket_deterioration_heal_threshold`: 50 → 8. Spec default is 50; this is a test-time-only fast-fail. The test already reads the live param (`T4.heal_threshold`) and adapts the loop iteration count, so the test would work at 50 too; it would just take longer. Keeping `8` is reasonable for devnet, just be explicit that it deviates from the production spec constant.
Both are documented in docs/plans/LEP6_DEVNET_TEST_PLAN.md lines 47–48, 73, but the PR description body itself doesn't mention them. Adding a one-line "Genesis param deviations for testability: divisor=1 (was 3), heal_threshold=8 (was 50)" to the PR body would make the diff self-describing.
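To make the fan-out change concrete, here is a minimal bash sketch of the §6.1 formula quoted above. The function name `challenge_target_count` is hypothetical and is not taken from `lep6_test.sh` or the chain code; it only evaluates the formula for a given active-supernode count and divisor.

```shell
#!/usr/bin/env bash
# Sketch of spec §6.1 as quoted in N1:
#   challenge_target_count = max(1, ceil(active_supernodes / divisor))
# The function name is an illustrative assumption, not part of the suite.
challenge_target_count() {
  local active="$1" divisor="$2"
  # Integer ceiling division: ceil(a / d) == (a + d - 1) / d for positive d.
  local count=$(( (active + divisor - 1) / divisor ))
  # Clamp to at least one target, as max(1, ...) requires.
  if (( count < 1 )); then
    count=1
  fi
  echo "$count"
}

challenge_target_count 5 3   # 5 supernodes, previous divisor of 3
challenge_target_count 5 1   # 5 supernodes, divisor of 1 from this PR
```

On a 5-supernode devnet the two calls print 2 and 5, which is the per-prober target count before and after the divisor change.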
N2 — EVM_CUTOVER_VERSION ?= v1.20.0 → v1.12.0 is unrelated to LEP-6. Makefile.devnet:84. This belongs in a separate PR (or at least an explicit line item in the description) — it affects _devnet-select-default-genesis and changes which genesis template devnet-build-version picks for every VERSION in [v1.12.0, v1.20.0). Easy to miss in a diff review titled "LEP-6 devnet tests." Suggest splitting or annotating.
N3 — max_multisig_sub_keys: 20 added to both devnet-genesis files. Master HEAD's devnet-genesis files do not contain this field; x/evmigration/types/params.go::Validate() rejects MaxMultisigSubKeys == 0. If the field was missing at genesis on master, evmigration InitGenesis would already be failing — so either (a) proto-3 defaulting + a downstream defaulter is filling it, or (b) master is broken for fresh devnet bring-up today. Either way, this addition is correct but unrelated to LEP-6 and should be flagged in the description (or filed as a separate "fix devnet genesis evmigration params" PR). The fix itself is fine.
N4 — Devnet test target is not wired into CI. Same as devnet-tests-everlight — manual-only. Fine and consistent with existing precedent, just worth confirming that the LEP-6 lifecycle gate will run as part of pre-release rehearsal (Upgrade-Testing-Guide), otherwise the test will silently rot the next time LEP-6 chain code drifts.
✅ Looks good
- Bootstrap discipline. `submit_bootstrap_host_reports` is correctly idempotent against `(epoch, acc)` via the on-chain `audit epoch-report` existence check, AND pins the epoch up front to avoid mid-loop epoch straddles; that's the exact failure mode that bit me on `everlight_test.sh`. The optional skip-list for the prober's own account before its full report is a thoughtful race guard.
- Heal-eligibility predicate is driven from live chain state, not hardcoded. The loop at `lep6_test.sh:1379` reads `recent_failure_count`, `distinct_holder_failure_count`, `last_index_failure_epoch` and gates on `(distinct_holders >= 2 || last_index_failure_epoch > 0 || recent_failures >= 2)`; that's exactly the chain's scheduling eligibility predicate (`storage_truth_heal_ops.go`), so the test stays correct if the genesis `heal_threshold` ever changes.
- Healer + verifier identities come from `heal_op.healer_supernode_account` / `verifier_supernode_accounts`, not test-side picking. This is what makes the lifecycle test robust against the chain's deterministic-singleton-healer selection logic (§18).
- Authorization tests (T5, T6, T7) assert specific rejection substrings (`"creator must be independent..."`, `"challenged_supernode_account must not equal creator"`, `"not found"`, `"verification already submitted by creator"`), not just `code != 0`. That distinction matters because it pins the contract, not just the outcome.
- `wait_for_tx` rc distinguishes DeliverTx failure (rc=2) from timeout (rc=1); every downstream caller honors this distinction. Good plumbing.
- `expect_tx_rejected_with_retry` correctly handles the CheckTx-vs-DeliverTx asymmetry (CheckTx `code != 0` short-circuits; otherwise wait for inclusion and check DeliverTx). This is the right shape; many devnet bash suites get this wrong and end up flaky.
- LEP-6 spec scoring constants asserted exactly: `+15` node suspicion, `+8` ticket deterioration in T3. These are the calibration substrate constants from §14/§16; pinning them in an integration test is exactly what was missing.
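For readers who have not opened the script, the heal-eligibility gate quoted above can be sketched as a small bash predicate. This is an illustration only: the function name `heal_eligible` and its argument order are assumptions, and it covers just the three-way condition from the review, not any separate threshold check the suite or chain may apply.

```shell
#!/usr/bin/env bash
# Sketch of the gate quoted in the review:
#   distinct_holders >= 2 || last_index_failure_epoch > 0 || recent_failures >= 2
# heal_eligible is a hypothetical name; the real loop lives in lep6_test.sh.
heal_eligible() {
  local distinct_holders="$1"
  local last_index_failure_epoch="$2"
  local recent_failures="$3"
  # (( ... )) exits 0 when the arithmetic expression is non-zero (true).
  (( distinct_holders >= 2 || last_index_failure_epoch > 0 || recent_failures >= 2 ))
}

# Example: one distinct holder, no index failure, one recent failure.
if heal_eligible 1 0 1; then
  echo "eligible"
else
  echo "not yet eligible"
fi
```

Because the three inputs would be read from chain state on each iteration, a loop built on a predicate like this keeps polling until the chain itself would schedule the heal op, rather than counting a fixed number of rechecks.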
Minor (drive-by, fix if cheap, otherwise ignore)
- `lep6_test.sh:1015`, `:1039`, etc.: `seed_inclusion=$(wait_for_tx "$seed_txhash" >/dev/null; echo $?)` works, but the subshell pattern discards the JSON. Where you do need the JSON (e.g. line 230 in `expect_tx_rejected_with_retry`), you correctly capture it; in the lifecycle test you don't, so this is fine. Just noting the pattern is asymmetric across the file.
- T4 hardcodes `HEAL_OP_TIMEOUT_SEC=600`. With `consecutive_epochs_to_postpone=1` and the host-report-refresh cadence in `wait_for_next_epoch`, this should be plenty, but consider exposing as an env var like the other knobs.
- `proof_result_json` always uses `artifact_class=STORAGE_PROOF_ARTIFACT_CLASS_INDEX`. That's deliberate (drives Class-A escalation and index-failure heal eligibility), but a comment in the helper saying so would help future maintainers.
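If the asymmetry in the first item is ever worth removing, one possible shape is to capture the JSON and the return code in a single call. The sketch below is an assumption-laden illustration: `wait_for_tx` here is a stub standing in for the suite's real helper, following only the rc convention described in this review (0 included, 1 timeout, 2 DeliverTx failure), and `describe_tx_outcome` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Stub standing in for the suite's wait_for_tx (assumption): prints tx JSON
# and returns 0 on success, 1 on timeout, 2 on DeliverTx failure.
wait_for_tx() {
  case "$1" in
    OK*)   echo '{"code":0}'; return 0 ;;
    FAIL*) echo '{"code":5}'; return 2 ;;
    *)     return 1 ;;
  esac
}

# Hypothetical helper: keep both the JSON and the rc from one invocation.
describe_tx_outcome() {
  local txhash="$1" tx_json rc
  # The assignment's exit status is that of the command substitution,
  # so rc receives wait_for_tx's return code without a second call.
  tx_json=$(wait_for_tx "$txhash") && rc=0 || rc=$?
  case "$rc" in
    0) echo "included" ;;
    1) echo "timeout" ;;
    2) echo "delivertx-failed: $tx_json" ;;
    *) echo "unexpected rc=$rc" ;;
  esac
}

describe_tx_outcome "OKHASH"
describe_tx_outcome "FAILHASH"
describe_tx_outcome "MISSINGHASH"
```

The `&& rc=0 || rc=$?` form also behaves under `set -e`, since the failing substitution sits on the left side of an `||` list and does not abort the script.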
Out-of-scope / addressed in plan
The PR description's "Out of scope" list is honest and accurate: full SN↔chain e2e is correctly gated on supernode #286/#287/#288, and the systemtests suite already covers postponement/recovery — no need to duplicate. The plan doc also correctly identifies the everlight_test.sh S7.7 "set epoch_length_blocks via gov" path as now-redundant given the param is genesis-immutable; that's a courtesy heads-up for that suite's maintainer and not this PR's concern.
Recommendation: Approve. Address N1/N2/N3 by either (a) adding a one-paragraph "Genesis & Makefile non-LEP-6 changes bundled here" section to the PR description, or (b) extracting them into a tiny follow-up PR. The actual LEP-6 test additions are production-grade.
— Zee