
fix(synth): stream the synthesis call + idle-token deadline (#104) #106

Merged
askalf merged 1 commit into master from fix/synth-stream-104
Jun 15, 2026

askalf commented on Jun 15, 2026

Owner

What

Synthesis now streams (even in --json/non-TTY mode) and the stream is bounded by an idle-token deadline. Fixes the dominant #104 failure mode.

Root cause (the #104 "fetch wedge" was a red herring)

A live --verbose trace caught it: factual-lookup fetched its sources fine, then aborted in synthesis. Its answer is large (~8k tokens, ~110–150s to generate). The non-streaming client (callLLM) waits for the whole response under a 120s whole-call timeout, so it intermittently timed out mid-generation and re-ran the full generation 3× (~360s) before failing. That surfaced as "0 sources / aborted due to timeout", but the 0 sources was only because the run died before emitting the JSON envelope.
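The arithmetic behind that failure can be sketched as below. This is a simplified model, not code from the repo: `wholeCallOutcome` is a hypothetical helper, and it assumes every attempt takes the same generation time and that a failed attempt burns the full timeout before the next one starts.

```typescript
// Hypothetical model of a whole-call timeout with retries.
// Assumptions (not from the codebase): each attempt needs the same
// generation time, and a timed-out attempt costs the full timeout.
interface Outcome {
  succeeds: boolean;
  wallClockS: number;
}

function wholeCallOutcome(
  generationS: number,
  timeoutS: number,
  attempts: number,
): Outcome {
  // The call only succeeds if the whole response fits under the cap.
  const succeeds = generationS <= timeoutS;
  // On failure, every attempt runs until the cap and is then thrown away.
  const wallClockS = succeeds ? generationS : timeoutS * attempts;
  return { succeeds, wallClockS };
}
```

With the figures quoted above, `wholeCallOutcome(130, 120, 3)` fails after 360s, while `wholeCallOutcome(110, 120, 3)` succeeds in 110s. A ~110–150s generation straddles the 120s cap, which would explain why the timeout was intermittent.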

Fix

  • synthesize.ts — always use callLLMStream (it accumulates and returns when there's no onToken). The streaming client bounds only the connect by the timeout, so a long-but-healthy stream completes in one pass (~130s, ~3× faster) instead of tripping the whole-call cap.
  • llm-stream.ts — add an idle-token deadline to the stream read (parseSSE idleMs): no token for timeoutMs cancels the stream and throws, so a genuine stall fails fast instead of hanging to the global --max-runtime (or forever, when unset — closing a pre-existing gap in the interactive path). Retry still wraps the connect only; a mid-stream synthesis stall is empirically a persistent upstream condition, so re-issuing just burns 3× the wall-clock.
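A minimal sketch of the idle-token deadline idea, assuming the token stream can be modelled as an `AsyncIterable`. The names `IdleTimeoutError`, `nextOrTimeout` and `withIdleDeadline` are hypothetical and are not the identifiers used in llm-stream.ts; the real parseSSE reads an SSE response body, which this sketch does not attempt to reproduce.

```typescript
// Hypothetical names throughout; this only illustrates the technique.
class IdleTimeoutError extends Error {
  constructor(idleMs: number) {
    super(`no token received for ${idleMs}ms`);
    this.name = "IdleTimeoutError";
  }
}

// Settle with the pending read, or reject if it takes longer than idleMs.
function nextOrTimeout<T>(pending: Promise<T>, idleMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new IdleTimeoutError(idleMs)), idleMs);
    pending.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}

// Re-yield tokens from `source`, restarting the deadline after every token.
// Total stream duration is unbounded; only the gap between tokens is capped.
async function* withIdleDeadline<T>(
  source: AsyncIterable<T>,
  idleMs: number,
): AsyncGenerator<T> {
  const it = source[Symbol.asyncIterator]();
  try {
    while (true) {
      const next = await nextOrTimeout(it.next(), idleMs);
      if (next.done) return;
      yield next.value;
    }
  } finally {
    // Best-effort cancel of the underlying stream. It is not awaited,
    // because a stalled source may never settle this promise.
    it.return?.()?.catch(() => {});
  }
}
```

The design point is that the timer is per token rather than per call: a healthy 130s stream never trips it, while a stream that goes silent fails after `idleMs` instead of waiting for a global runtime cap.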

Tests

  • parseSSE: idle timeout aborts a stalled stream → TimeoutError; a prompt stream with idleMs set completes with no false timeout.
  • callLLMStream: a mid-stream stall surfaces (connect-only retry, no silent re-issue).
  • agent-loop mock now emits SSE for streaming synth calls.
  • Full suite green (723, 0 fail).

Validation (e2e through dario)

factual-lookup now passes in ~130s (was ~408s on the slow path, with ~50% of runs hitting the whole-call timeout).

Honest residual: an intermittent upstream SSE stall remains (~1 in 4 runs) — the stream stops mid-generation and the idle deadline fails it fast. Adding whole-call retry for this case was tried and reverted: the stall persists across the retry window, so it only made failures slower (~400s) with no better pass rate. This residual is tracked in #104 and is likely round-trip/tunnel aggravated (these runs reach dario over an SSH tunnel); it'll be retested with deepdive running in-network next to dario.

factual-lookup's synthesis generates a large (~8k-token) answer that takes
~110-150s. The non-streaming client waits for the whole response under a 120s
whole-call timeout, so it intermittently timed out mid-generation and re-ran
the full generation 3x (~360s) before failing. That is the #104 wedge — it
surfaced as "0 sources / aborted due to timeout", but the fetch stage was a
red herring: the run had its sources and was in synth.

Route synthesis through the streaming client even in non-TTY/--json mode
(callLLMStream accumulates and returns when there's no onToken). The streaming
client bounds only the connect by the timeout, so a long-but-healthy stream
completes in one pass (~130s, ~3x faster than the old slow path).
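The "accumulates and returns when there's no onToken" behaviour can be sketched roughly as follows. `collectStream` and the `OnToken` type are hypothetical stand-ins, not the actual callLLMStream signature, and the token source is modelled as a plain `AsyncIterable<string>`.

```typescript
// Hypothetical stand-in for the accumulate-or-forward behaviour.
type OnToken = (token: string) => void;

async function collectStream(
  tokens: AsyncIterable<string>,
  onToken?: OnToken,
): Promise<string> {
  let text = "";
  for await (const token of tokens) {
    // Always accumulate so the caller gets the full answer back.
    text += token;
    // Forward to the live consumer only when one was supplied
    // (for example, an interactive TTY renderer).
    onToken?.(token);
  }
  return text;
}
```

Under this shape, a --json or non-TTY caller simply omits `onToken` and awaits the returned string, so it can share the streaming path without any rendering.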

Add an idle-token deadline to the stream read (parseSSE idleMs): no token for
timeoutMs cancels the stream and throws, so a genuine stall fails fast instead
of hanging to the global --max-runtime (or forever, when it's unset) — closing
a pre-existing gap in the interactive path too.

Retry still wraps the connect only. A mid-stream synthesis stall is, empirically
(#104), a persistent upstream condition that re-issuing doesn't recover from, so
re-streaming just burns 3x the wall-clock; we fail fast instead.
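One way to picture "retry wraps the connect only" is the sketch below. `connectWithRetry` and `streamOnce` are hypothetical names, the default of three attempts is an assumption read off the "3x" figure above, and backoff between attempts is omitted for brevity.

```typescript
// Hypothetical sketch: retries cover the connect phase, never the read phase.
async function connectWithRetry<S>(
  connect: () => Promise<S>,
  attempts: number,
): Promise<S> {
  let lastError: unknown = new Error("no connect attempts were made");
  for (let i = 0; i < attempts; i++) {
    try {
      return await connect();
    } catch (error) {
      // Remember the failure and try again (no backoff in this sketch).
      lastError = error;
    }
  }
  throw lastError;
}

async function streamOnce(
  connect: () => Promise<AsyncIterable<string>>,
  attempts = 3, // assumed value, not taken from the codebase
): Promise<string> {
  // Retried: establishing the stream.
  const stream = await connectWithRetry(connect, attempts);
  let text = "";
  // Not retried: an error while reading propagates to the caller as-is,
  // so a mid-stream stall fails once instead of re-issuing the generation.
  for await (const token of stream) text += token;
  return text;
}
```

The read loop sits outside the retry helper on purpose, which is what keeps a persistent upstream stall from costing several full generations.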

Tests: parseSSE idle-timeout (stall -> TimeoutError; prompt stream -> no false
timeout) + a mid-stream-stall-surfaces test; the agent-loop mock now speaks SSE
for streaming synth calls. Full suite green.

Validated e2e through dario: factual-lookup now passes in ~130s (was ~408s on
the slow path / ~50% whole-call timeout). A residual intermittent upstream SSE
stall remains (~1 in 4) — tracked in #104; likely round-trip/tunnel aggravated,
to be retested with deepdive running in-network next to dario.

askalf merged commit 92ec973 into master on Jun 15, 2026
5 checks passed
askalf deleted the fix/synth-stream-104 branch on June 15, 2026 16:08
askalf added a commit that referenced this pull request on Jun 15, 2026
)

Ships the synthesis-reliability fix to npm: synthesis always streams (even
in --json / non-TTY) bounded by an idle-token deadline, so a long generation
completes in one pass and a stalled stream fails fast instead of burning 3x
whole-call retries.