Monotonic time & prep for router graceful retirement #10718

Open
NomDeTom wants to merge 4 commits into meshtastic:develop from NomDeTom:monotonic-time

@NomDeTom

Collaborator

Three related infrastructure workstreams on one branch (off develop @ 8a0c7592c):

  1. Monotonic time abstraction — one injectable, rollover-immune clock seam, with every
    millis() call site in src/ swept onto it.
  2. Standardised local test runner (bin/run-tests.sh): a single RED/AMBER/GREEN verdict so a
    broken suite is caught locally, not in CI.
  3. Router retirement slope (opt-in, default OFF) — an unattended router gradually demotes
    itself so abandoned infrastructure stops clogging dense meshes.

All three are either behaviour-neutral or default-off. The branch touches ~129 files, most of them part of the mechanical clock sweep.

1 · Monotonic time (src/Time.{h,cpp})

A small root-level utility (mirrors memGet), hosted via configuration.h:

  • getMillis() — drop-in for millis() (rollover-safe with the usual subtraction idiom).
  • getMillis64() — 64-bit monotonic ms; immune to the 49.7-day uint32_t wrap.
  • Test seam: under PIO_UNIT_TESTING, a settable virtual clock (Time::setTestMillis() /
    advanceTestMillis()) — default OFF, so suites using real time are unaffected. One seam for all
    suites, replacing the ad-hoc per-module s_testNowMs clocks.

Every millis() in src/ is swept to getMillis() except the providers themselves (platform
defs, USBHal.h override) and the seam. The two existing module clocks (HopScaling,
TrafficManagement) now delegate to Time::.
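To illustrate the shape of such a seam, here is a minimal sketch of a clock that widens a wrapping 32-bit millisecond counter to 64 bits and can be overridden in tests. It is not the code in src/Time.{h,cpp}: the MonotonicClock class, the injected rawSource function pointer, clearTestMillis() and extend() are assumptions made for the example, and the sketch is not thread-safe.

```cpp
#include <cstdint>

namespace sketch {

// Hypothetical illustration only; the PR's real seam lives in src/Time.{h,cpp}.
class MonotonicClock {
  public:
    // rawSource stands in for the platform's 32-bit millis() provider.
    explicit MonotonicClock(uint32_t (*rawSource)()) : rawSource_(rawSource) {}

    // Test seam: once set, the clock returns the injected value instead of real time.
    void setTestMillis(uint64_t ms) { testEnabled_ = true; testNow_ = ms; }
    void advanceTestMillis(uint64_t delta) { testEnabled_ = true; testNow_ += delta; }
    void clearTestMillis() { testEnabled_ = false; }

    // 64-bit monotonic milliseconds; does not wrap after 49.7 days.
    uint64_t getMillis64() { return testEnabled_ ? testNow_ : extend(rawSource_()); }

    // 32-bit view that wraps exactly like millis().
    uint32_t getMillis() { return static_cast<uint32_t>(getMillis64()); }

    // Widen a raw 32-bit reading by counting wraps. This only works if it is
    // called at least once per 2^32 ms, otherwise a wrap goes unnoticed.
    uint64_t extend(uint32_t raw) {
        if (raw < lastRaw_) {
            ++wraps_; // the counter went backwards, so it wrapped
        }
        lastRaw_ = raw;
        return (static_cast<uint64_t>(wraps_) << 32) | raw;
    }

  private:
    uint32_t (*rawSource_)();
    bool testEnabled_ = false;
    uint64_t testNow_ = 0;
    uint32_t lastRaw_ = 0;
    uint32_t wraps_ = 0;
};

} // namespace sketch
```

With a virtual clock set to 0xFFFFFFFF and advanced by 2 ms, getMillis64() in this sketch keeps counting upward while getMillis() wraps to 1, which is the kind of boundary a rollover test would want to cross.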

2 · bin/run-tests.sh

The authoritative local test verdict. Runs the native suites, then:

  • matches every pass/fail spelling ([PASSED]/:PASS, [FAILED]/:FAIL:, [ERRORED],
    error:, crash signatures) and requires a positive summary, not just the absence of failures;
  • cross-checks the number of suites that ran against ls test/test_*/ and reports AMBER if one
    silently goes missing (SKIPPED-aware);
  • emits one machine-readable line (RESULT: GREEN N/N / AMBER ran X/N missing: … / RED …).

Deliberately not wired to CI's exact invocation — it's the local pre-flight.
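The verdict rules above can be sketched as a small classifier. This is an illustration in C++ rather than the shell logic in bin/run-tests.sh; the classify() function, the Verdict enum and the exact "[SKIPPED]" marker are assumptions for the example, and crash-signature matching is left out for brevity.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical names; the real script is bin/run-tests.sh.
enum class Verdict { Green, Amber, Red };

// Classify a captured test log against the number of suites expected to exist.
inline Verdict classify(const std::vector<std::string> &lines, int expectedSuites) {
    // Any failure spelling, at suite level or assertion level, forces RED.
    static const std::regex failRe(R"(\[FAILED\]|\[ERRORED\]|:FAIL:|error:)");
    int passed = 0;
    int skipped = 0;
    bool failed = false;
    for (const auto &line : lines) {
        if (std::regex_search(line, failRe)) {
            failed = true;
        }
        if (line.find("[PASSED]") != std::string::npos) {
            ++passed; // suite-level positive evidence
        }
        if (line.find("[SKIPPED]") != std::string::npos) {
            ++skipped; // assumed marker; skipped suites count as accounted for
        }
    }
    if (failed) {
        return Verdict::Red;
    }
    if (passed == 0) {
        return Verdict::Red; // no positive summary: absence of failure is not a pass
    }
    if (passed + skipped < expectedSuites) {
        return Verdict::Amber; // a suite silently went missing
    }
    return Verdict::Green;
}
```

The ordering is the design point: failure evidence wins outright, an empty log is never green, and the suite count is only consulted once at least one positive result has been seen.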

3 · Router retirement slope (default OFF)

A persisted "managed-uptime credit" accrues with uptime and resets on any authenticated admin
session (remote over-mesh or local USB/BLE; any management channel counts as "still tended").
At the threshold (~3 months) the node demotes one rung along ROUTER → ROUTER_LATE → CLIENT
and resets the credit; CLIENT is the floor (no-op). Demotion reuses the existing role-change path
(installRoleDefaults + saveChanges + reboot).

Policy is pure static helpers (fully unit-tested); device wiring is an OSThread registered behind
the config flag. Default OFF — no behaviour change unless explicitly enabled.
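A minimal sketch of what pure policy helpers of this kind might look like follows. The helper names come from the commit message, but the signatures, the Role enum, the 90-day default, the "0 means use the default" convention and the inclusive threshold comparison are all assumptions for illustration, not the PR's actual code.

```cpp
#include <cstdint>

// Hypothetical stand-in for the firmware's device role enum.
enum class Role { Client, RouterLate, Router };

// Assumed default: 90 days, as an approximation of "~3 months".
constexpr uint32_t kDefaultThresholdSecs = 90u * 24u * 60u * 60u;

// Only infrastructure roles sit on the retirement slope.
inline bool isRetirableRole(Role r) {
    return r == Role::Router || r == Role::RouterLate;
}

// One rung down the slope; CLIENT is the floor and maps to itself.
inline Role nextRetirementRole(Role r) {
    switch (r) {
    case Role::Router:
        return Role::RouterLate;
    case Role::RouterLate:
        return Role::Client;
    default:
        return r;
    }
}

// Assumption: a configured value of 0 means "use the default".
inline uint32_t effectiveThresholdSecs(uint32_t configuredSecs) {
    return configuredSecs != 0 ? configuredSecs : kDefaultThresholdSecs;
}

// Pure decision function: no globals, so it can be unit-tested in isolation.
inline bool shouldRetire(bool enabled, Role r, uint32_t creditSecs, uint32_t configuredSecs) {
    return enabled && isRetirableRole(r) && creditSecs >= effectiveThresholdSecs(configuredSecs);
}
```

Keeping the decision free of device state means the disabled-by-default case, the threshold boundary and the CLIENT no-op can each be checked with a single call.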

Testing

  • New test/test_time/ — proves getMillis64() crosses 0xFFFFFFFF correctly under injection.
  • New test/test_router_retirement/ — credit accrual, remote/local admin reset, the full slope,
    threshold boundary, CLIENT no-op, disabled-by-default (uses the WS1 time seam to fast-forward
    3-month windows with no real sleep).
  • ./bin/run-tests.sh → GREEN; pio run -e native links.
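As a rough picture of fast-forwarding a 3-month window without sleeping, the sketch below drives an hourly accrual loop from a virtual clock. VirtualClock, CreditTracker, hoursUntilThreshold() and the 90-day threshold are hypothetical names and values for the example, not the contents of test/test_router_retirement/.

```cpp
#include <cstdint>

// Hypothetical virtual clock: time only moves when the test says so.
struct VirtualClock {
    uint64_t nowMs = 0;
    void advance(uint64_t ms) { nowMs += ms; }
};

// Hypothetical credit tracker: accrues whole seconds between ticks.
struct CreditTracker {
    uint32_t creditSecs = 0;
    uint64_t lastTickMs = 0;

    void tick(uint64_t nowMs) {
        // Sub-second remainders are dropped here; with hourly ticks none occur.
        creditSecs += static_cast<uint32_t>((nowMs - lastTickMs) / 1000);
        lastTickMs = nowMs;
    }

    // Any admin session counts as "still tended" and clears the credit.
    void onAdminSession() { creditSecs = 0; }
};

constexpr uint64_t kHourMs = 60ull * 60ull * 1000ull;
constexpr uint32_t kThresholdSecs = 90u * 24u * 60u * 60u; // assumed 90-day threshold

// Advance the virtual clock hour by hour until the credit reaches the threshold.
inline uint32_t hoursUntilThreshold(VirtualClock &clock, CreditTracker &tracker) {
    uint32_t hours = 0;
    while (tracker.creditSecs < kThresholdSecs) {
        clock.advance(kHourMs);
        tracker.tick(clock.nowMs);
        ++hours;
    }
    return hours;
}
```

Under these assumptions the loop finishes after 2160 simulated hours, at which point the clock reads 7,776,000,000 ms. That is past the 32-bit limit of 4,294,967,295 ms, so a window of this length needs a 64-bit clock to measure directly.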

⚠️ Caveats / deferred

  • Proto is hand-stubbed. Router retirement adds fields to DeviceState
    (router_retirement_credit_secs) and a RouterRetirementConfig to module_config, hand-edited
    directly in the generated *.pb.{h,cpp}. This branch will not build for others until the
    upstream meshtastic/protobufs PR lands and the submodule is bumped (same caveat pattern as
    PR1's snr_q4). The .proto source change + regeneration is a follow-up.
  • Unsafe millis() rollover comparisons at Power.cpp:806/811 and GPS.cpp:1532 were left
    rename-only — the uint32 -1 sentinel makes the subtraction-idiom conversion risky without
    test coverage. Convert later with tests; getMillis64() is available.
  • Router-retirement runtime path (runOnce accrual + demote/reboot) is unit-tested at the
    policy-helper level only; exercise the full demote on bench/sim before relying on it.
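For context on the rollover caveat, the sketch below contrasts the subtraction idiom with an absolute deadline comparison, and shows one way a uint32 -1 sentinel can interact badly with the idiom. The function names and the kNever constant are hypothetical; the actual code at the Power.cpp and GPS.cpp sites is not reproduced here and may use the sentinel differently.

```cpp
#include <cstdint>

// Rollover-safe: unsigned subtraction gives the true elapsed time modulo 2^32,
// provided the real interval is shorter than about 49.7 days.
inline bool elapsedSince(uint32_t now, uint32_t start, uint32_t intervalMs) {
    return static_cast<uint32_t>(now - start) >= intervalMs;
}

// Rollover-unsafe: if the deadline wrapped past zero and "now" has not yet,
// this reports the deadline as already passed.
inline bool deadlinePassedUnsafe(uint32_t now, uint32_t deadline) {
    return now >= deadline;
}

// Hypothetical sentinel meaning "no timestamp recorded".
constexpr uint32_t kNever = static_cast<uint32_t>(-1);

// Sentinel-aware variant: check the sentinel before doing any arithmetic on it.
inline bool elapsedSinceOrNever(uint32_t now, uint32_t start, uint32_t intervalMs) {
    if (start == kNever) {
        return false; // nothing was started, so nothing can have elapsed
    }
    return elapsedSince(now, start, intervalMs);
}
```

With start = 0xFFFFFF00 and a 0x200 ms interval, the wrapped deadline is 0x100. At now = 0xFFFFFF80 the subtraction idiom correctly reports "not yet" while the absolute comparison fires early. Feeding kNever into the plain idiom treats it as a timestamp one millisecond before the wrap, so elapsedSince(1000, kNever, 500) is true, which is why a mechanical conversion without tests is risky.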

Commits

  • 0ec742c0f — monotonic time: central Time:: seam + full millis() sweep + test runner
  • 42c99c6e4 — run-tests.sh: match all pass/fail spellings; SKIPPED-aware count check
  • 78f7cb3ad — router retirement slope (ROUTER → ROUTER_LATE → CLIENT)

🤖 Generated with Claude Code

🤝 Attestations

  • I have tested that my proposed changes behave as described.
  • I have tested that my proposed changes do not cause any obvious regressions on the following devices:
    • Heltec (Lora32) V3
    • LilyGo T-Deck
    • LilyGo T-Beam
    • RAK WisBlock 4631
    • Seeed Studio T-1000E tracker card
    • Other (please specify below)

NomDeTom and others added 4 commits June 14, 2026 19:13
WS2 — Monotonic uptime abstraction:
- New src/Time.{h,cpp}: getMillis() (drop-in for millis()) and getMillis64()
  (rollover-immune 64-bit, for >49.7-day durations). Test injection via
  Time::setTestMillis()/advanceTestMillis(), default OFF so suites relying on
  real time are unaffected. Hosted via configuration.h so it's available
  codebase-wide without per-file includes.
- Swept all ~485 millis() call sites across src/ to Time::getMillis(), except
  the providers (platform millis() definitions, USBHal override) and the seam
  itself. HopScalingModule's production clock now routes through Time::.
- Added explicit Time.h includes to the ~19 files that didn't transitively
  reach configuration.h.

WS3 — bin/run-tests.sh: runs the native coverage suite and emits one
RED/AMBER/GREEN verdict with a canonical suite-count cross-check (AMBER if
fewer suites ran than exist under test/), per .notes/test-passfail-filter.md.

New test/test_time/ covers injection + 64-bit rollover across 0xFFFFFFFF.

Verified: native `pio run -e native` links clean; `bin/run-tests.sh` => GREEN
20/20 suites, 342/342 cases (incl. test_time, hop_scaling, traffic_management).

Deferred (recorded): unsafe millis() deadline comparisons in Power.cpp /
GPS.cpp left as rename-only (uint32 sentinel hazard) — to convert with test
coverage in a follow-up.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The same outcome is spelled differently by each layer (Unity :PASS/:FAIL: per
assertion, pio [PASSED]/[FAILED]/[ERRORED] per suite, "N succeeded"/"M failed"
summary). Grepping one spelling misses the others and yields a false verdict
(the trap that produced earlier false greens). FAIL_RE/PASS_RE now alternate
over every spelling. The canonical suite cross-check now treats pio-SKIPPED
suites as accounted-for so hardware-only skips don't false-AMBER.

Deliberately not aligned to the CI invocation — this is the local/agent verdict
tool; CI keeps its JUnit path. Validated against captured logs + a live run.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Auto-demote an unattended infrastructure node one rung per ~3 months of
cumulative uptime with no admin session. Opt-in (default OFF).

- Proto (hand-stubbed in generated nanopb; .proto/regeneration deferred to the
  upstream protobufs PR — see .notes/router-retirement-proto.md):
  * DeviceState.router_retirement_credit_secs (persisted credit, tag 14)
  * ModuleConfig.RouterRetirementConfig {enabled, step_threshold_secs} oneof
    member (tag 17) + the LocalModuleConfig runtime field (tag 18)
- RouterRetirementModule (OSThread): hourly credit accrual; on threshold,
  demote one rung via installRoleDefaults + saveToDisk + reboot. Policy split
  into pure static helpers (isRetirableRole / nextRetirementRole /
  effectiveThresholdSecs / shouldRetire) for unit testing without globals.
- AdminModule: any admin session (local OR remote) resets the credit.
- test/test_router_retirement/: 12 cases over the policy helpers.

Validated: pio run -e native links clean; bin/run-tests.sh => GREEN 21/21
suites, 354/354 cases (incl. new test_router_retirement + test_time).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
NomDeTom added the enhancement, ai-generated and needs-tacos labels Jun 15, 2026
github-actions Bot added the needs-review label Jun 15, 2026
@github-actions

github-actions Bot commented Jun 15, 2026

Contributor

⚡ Try this PR in the Web Flasher


Warning

This is an automated, unreviewed CI test build. Back up your device configuration
before flashing, and only flash devices you are able to recover.

Supported boards built by this PR (24)
Device                     Board                        Platform
Crowpanel Adv 3.5 TFT      elecrow-adv-35-tft           esp32-s3
Heltec HT62                heltec-ht62-esp32c3-sx1262   esp32-c3
Heltec Mesh Node 096       heltec-mesh-node-t096        nrf52840
Heltec Mesh Node T1        heltec-mesh-node-t1          nrf52840
Heltec Mesh Node T114      heltec-mesh-node-t114        nrf52840
Heltec V3                  heltec-v3                    esp32-s3
Heltec V4                  heltec-v4                    esp32-s3
Raspberry Pi Pico          pico                         rp2040
Raspberry Pi Pico W        picow                        rp2040
RAK WisMesh Tag            rak_wismeshtag               nrf52840
RAK WisBlock 11200         rak11200                     esp32
RAK WisBlock 11310         rak11310                     rp2040
RAK3312                    rak3312                      esp32-s3
RAK WisBlock 4631          rak4631                      nrf52840
Seeed Wio Tracker L1       seeed_wio_tracker_L1         nrf52840
Seeed Xiao NRF52840 Kit    seeed_xiao_nrf52840_kit      nrf52840
Seeed Xiao ESP32-S3        seeed-xiao-s3                esp32-s3
Station G2                 station-g2                   esp32-s3
Station G3                 station-g3                   esp32-s3
LILYGO T-Deck              t-deck-tft                   esp32-s3
LILYGO T-Echo              t-echo                       nrf52840
LILYGO T-Echo Plus         t-echo-plus                  nrf52840
LilyGo T3-C6               tlora-c6                     esp32-c6
Seeed SenseCAP T1000-E     tracker-t1000-e              nrf52840

Build artifacts expire on 2026-07-15. Updated for 243ea6a.

@NomDeTom NomDeTom requested a review from thebentern June 15, 2026 01:36
@github-actions

Contributor

Firmware Size Report

22 targets | vs develop: 22 increased, net +19,792 (+19.3 KB)

Target                             Size        vs develop
t-deck-tft                         3,782,640   📈 +1,136 (+1.1 KB)
picow                              1,221,772   📈 +1,024 (+1.0 KB)
rak11310                           784,064     📈 +976
pico                               761,464     📈 +968
seeed_xiao_rp2040                  759,664     📈 +968
rak11200                           1,831,632   📈 +960
tlora-c6                           2,341,360   📈 +944
heltec-v3                          2,235,856   📈 +928
heltec-vision-master-e213-inkhud   2,195,984   📈 +928
t-eth-elite                        2,462,496   📈 +928
elecrow-adv-35-tft                 3,388,528   📈 +912
pico2w                             1,198,188   📈 +908
heltec-v4                          2,256,368   📈 +896
station-g3                         2,237,712   📈 +896
pico2                              749,272     📈 +880
station-g2                         2,247,280   📈 +880
seeed_xiao_rp2350                  747,416     📈 +872
heltec-ht62-esp32c3-sx1262         2,107,664   📈 +864
rak3312                            2,242,736   📈 +800
seeed-xiao-s3                      2,246,736   📈 +800
rak3172                            180,968     📈 +668
wio-e5                             233,332     📈 +656

Updated for a995220

Labels

  • ai-generated (Possible AI-generated low-quality content)
  • enhancement (New feature or request)
  • needs-review (Needs human review)
  • needs-tacos (Every night can be taco night)

1 participant