fix(gateway): s6 lifecycle supervision + container_environment API key #32
5kahoisaac wants to merge 2 commits into
af3ee89 to 5b35cf9
|
Thanks for splitting this out. I’m going to hold off on merging this one for now because it changes the highest-risk runtime path: gateway supervision, s6 handoff, container environment propagation, shutdown behavior, and Docker defaults. A couple of concerns I’d like resolved first:
Could you add live Docker/HF evidence for this PR specifically? I’d want to see logs showing:
The direction may be right, but I want runtime proof before merging this one. |
|
Addressed both concerns:
Regarding the live Docker/HF evidence request — I don't have a running HF Space to attach logs from, but if you can share one I'm happy to validate there. Alternatively if you want to run it locally the |
|
Thanks for updating this. The two concerns I raised earlier look addressed now:
One blocker remains: this PR is no longer mergeable against current Could you rebase/update this PR on the latest After that, I still want runtime evidence before merging because this changes the gateway supervision path. Please include logs showing:
Once it is up to date and has that Docker/HF runtime proof, I’m open to merging it. |
…nt API key
- start.sh: replace PID-based gateway supervision with a health-endpoint monitor loop; use `hermes gateway run/restart` (s6 hand-off aware), add wait_for_port_free, graceful CLI-based shutdown, and HERMES_GATEWAY_NO_SUPERVISE export.
- Dockerfile: ARG HERMES_AGENT_VERSION (default at FROM), API_SERVER_* as Docker ENV (read from s6 container_environment), and COPY the cont-init.d hook.
- cont-init.d/016-huggingmes-api-server-key: alias GATEWAY_TOKEN -> API_SERVER_KEY in the gateway's container_environment so its API server can bind 8642.
- docker-compose.yml: HERMES_GATEWAY_NO_SUPERVISE default.
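For readers who have not opened the diff, the health-endpoint monitor loop described in this commit could look roughly like the sketch below. It is not the PR's actual `start.sh`: the `/health` path, the failure threshold, and the function names `gateway_healthy`, `restart_gateway`, and `monitor_gateway` are assumptions made for illustration. Only port 8642 and the `hermes gateway restart` command come from the commit message.

```shell
# Hypothetical sketch of a health-endpoint monitor loop (not the PR's start.sh).
# The /health path is an assumption; only port 8642 appears in the PR.
HEALTH_URL=${HEALTH_URL:-http://127.0.0.1:8642/health}

# Succeeds when the gateway answers the health probe with a 2xx status.
gateway_healthy() {
    curl -fsS --max-time 5 "$HEALTH_URL" >/dev/null 2>&1
}

# Restart action. The commit message names `hermes gateway restart`;
# the exact invocation used in start.sh is assumed here.
restart_gateway() {
    hermes gateway restart
}

# monitor_gateway CHECKS MAX_FAILS INTERVAL
# Polls the probe CHECKS times. After MAX_FAILS consecutive failures it
# restarts the gateway and resets the counter. CHECKS bounds the loop so the
# sketch terminates; a real supervisor would loop until shutdown.
monitor_gateway() {
    checks=${1:-10}
    max_fails=${2:-3}
    interval=${3:-5}
    fails=0
    i=0
    while [ "$i" -lt "$checks" ]; do
        if gateway_healthy; then
            fails=0
        else
            fails=$((fails + 1))
            if [ "$fails" -ge "$max_fails" ]; then
                restart_gateway
                fails=0
            fi
        fi
        i=$((i + 1))
        sleep "$interval"
    done
}
```

Probing an endpoint instead of tracking a PID means the loop keeps working after the gateway process is handed off to s6 and its PID changes. Requiring several consecutive failures before a restart is one way to avoid restarting on a single slow response.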
…PI_SERVER_KEY
- Dockerfile: use ${HERMES_AGENT_VERSION:-latest} in ENV so the runtime
variable is never empty when no --build-arg is supplied (preserves the
same default already used in FROM).
- cont-init.d/016-huggingmes-api-server-key: when GATEWAY_TOKEN is absent
generate an ephemeral API_SERVER_KEY directly in container_environment so
the s6-supervised gateway can start its API server. Previously the key was
only generated in start.sh (which runs after cont-init), creating a race
where the gateway would launch without the key and refuse to bind port 8642.
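The race fix described above could be sketched as a cont-init hook body like the one below. This is an illustration, not the contents of `016-huggingmes-api-server-key`: the function name `write_api_server_key` and the `ENV_DIR` override are hypothetical, and the default path assumes s6-overlay v3, which keeps one file per variable under `/run/s6/container_environment`.

```shell
# Hypothetical sketch of the cont-init.d hook logic (the real hook may differ).
# ENV_DIR defaults to the s6-overlay v3 container environment directory and is
# overridable only so the function can be exercised outside a container.
write_api_server_key() {
    env_dir=${ENV_DIR:-/run/s6/container_environment}
    mkdir -p "$env_dir" || return 1

    if [ -n "${GATEWAY_TOKEN:-}" ]; then
        # Alias: reuse the operator-supplied token as the API server key.
        key=$GATEWAY_TOKEN
    else
        # Ephemeral key: 32 random bytes rendered as 64 hex characters.
        key=$(od -An -N32 -tx1 /dev/urandom | tr -d ' \n')
    fi

    # Write with a restrictive umask; printf adds no trailing newline.
    ( umask 077; printf '%s' "$key" > "$env_dir/API_SERVER_KEY" )
}
```

Because cont-init hooks run before s6 starts its supervised services, writing the key here means it already exists when the gateway launches, which is the ordering the commit message relies on.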
14ee06e to 1202878
|
Rebased onto current |
Split of #26 (part 4/5).
Gateway / s6 lifecycle + `container_environment` API key
- `start.sh`: replace PID-based gateway supervision with a health-endpoint monitor loop; use `hermes gateway run/restart` (s6 hand-off aware), add `wait_for_port_free`, graceful CLI-based shutdown, and the `HERMES_GATEWAY_NO_SUPERVISE` export. Fixes the s6-supervise restart storm / shutdown hang.
- `Dockerfile`: `ARG HERMES_AGENT_VERSION` with the default applied at `FROM` (`:-latest`); `API_SERVER_*` as Docker ENV (read from s6 `container_environment`, not `start.sh` exports); COPY the cont-init hook.
- `cont-init.d/016-huggingmes-api-server-key`: alias `GATEWAY_TOKEN` → `API_SERVER_KEY` in the gateway's `container_environment` so its API server can bind `8642`.
- `docker-compose.yml`: `HERMES_GATEWAY_NO_SUPERVISE` default.

Test plan
- `API_SERVER_*` reach the s6-supervised gateway; `8642` binds; dashboard shows Gateway: Online.
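The description names a `wait_for_port_free` helper without showing it. A minimal sketch of what such a helper might do is below. The signature, the timeout default, and the `port_in_use` probe are assumptions, and the probe relies on an `nc` that supports `-z`, which the actual image may not provide.

```shell
# Hypothetical probe: succeeds if something accepts TCP connections on
# 127.0.0.1:PORT. Assumes an `nc` with -z support is installed.
port_in_use() {
    nc -z 127.0.0.1 "$1" >/dev/null 2>&1
}

# wait_for_port_free PORT [TIMEOUT_SECONDS]
# Returns 0 as soon as nothing is listening on PORT, or 1 once TIMEOUT_SECONDS
# have elapsed with the port still busy. Signature and default are assumed.
wait_for_port_free() {
    port=$1
    timeout=${2:-30}
    waited=0
    while port_in_use "$port"; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

A caller could run `wait_for_port_free 8642 30` between stopping the old gateway and starting a new one, so the new process does not try to bind 8642 while the old one still holds it.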