fix: stream group_agent_export instead of buffering the whole group by agent-p1p · Pull Request #112 · marmot-protocol/goggles

agent-p1p · 2026-06-24T06:12:38Z

Closes #109

Summary

group_agent_export was the last unbounded full-group path: it materialized every valid event (including the per-event raw_context/raw_kind JSONField blobs) into a Python list, built one nested export dict, then serialized one large JSON string — all held in a single gunicorn worker's heap before the response was written. For a large group (50 MiB/file upload ceiling × many files) this could spike a worker's RSS into the hundreds of MB/GB and block or OOM it for all users — the same memory-exhaustion class the bounded landing (#17), detail (#38), tabs (#92), and timeline (#38/#94) paths were introduced to avoid.

Per the issue's design guidance, the export is the authoritative, complete whole-group artifact, so capping/truncating is wrong — completeness is the point. This streams the response instead, keeping memory roughly per-chunk + the bounded aggregate sections rather than per-group, with no cap and no truncation.

Approach

"Complete" and "buffer the whole thing in one heap" are separable. The fix splits the export into two passes:

Bounded header (agent_state_export_header_for_group) — every section except the per-event array (group, summary, sources, timeline, actions, messages, sensitivity). These aggregate sections are bounded by engine/message/action cardinality, not raw event count, and — crucially — none of them read raw_context/raw_kind (only agent_event_row does). So the header is built from events fetched without the export-only JSON blobs (include_export_fields=False), i.e. the same lean rows the existing bounded views already load.
Lazy per-event generator (agent_event_rows_for_group) — streams agent_event_row dicts (these do carry the raw JSON) from the DB via .iterator(chunk_size=2_000) (the chunked-iteration pattern already used in this module). Ordered in SQL with F("wall_time_ms").asc(nulls_last=True), line_number, id to match sorted_timeline_events exactly, so rows stream in the final order without a Python re-sort that would force full materialization.

group_agent_export now returns a StreamingHttpResponse whose generator serializes the header once, replaces its closing brace with a streamed "events": [...] array, and serializes each event row individually — so the body is written incrementally and the export is never one giant in-memory JSON string. Both compact and ?pretty=1 modes are preserved.

Verification (local, full CI parity except docker postgres + dep-audit)

uv run python manage.py test — 158 passed (155 existing + 3 new regression tests), no regressions
uv run ruff check . — clean
uv run ruff format --check . — clean
uv run python manage.py check — no issues
uv run python manage.py makemigrations --check --dry-run — No changes detected (no model/migration drift)

GitHub Actions additionally covers the Postgres test run (verifies nulls_last ordering on Postgres, where NULL placement differs from SQLite) and dependency audit.

New regression tests

test_group_agent_export_streams_events_lazily_in_chunks — agent_event_rows_for_group is a lazy generator; view returns StreamingHttpResponse
test_group_agent_export_streams_complete_events_in_timeline_order — completeness (every valid event, no drop) + deterministic order matching the buffered sorted_timeline_events order, across two engines
test_group_agent_export_pretty_streams_valid_json — ?pretty=1 still yields valid, complete, indented JSON
Updated test_group_agent_export_returns_agent_readable_json to consume the streamed body

Sensitive paths

None. Touches forensics/views.py + forensics/analysis.py + forensics/tests.py only — no models, migrations, settings, auth, or token code (repo_sensitive_paths untouched; makemigrations --check confirms no DB drift).

coderabbitai · 2026-06-24T06:12:46Z

Warning

Review limit reached

@agent-p1p, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 38 minutes and 56 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: aafda3b4-11da-4ea3-ab6c-ce5e0865f40a

📥 Commits

Reviewing files that changed from the base of the PR and between 2591fe7 and 57d292a.

📒 Files selected for processing (4)

config/settings.py
forensics/analysis.py
forensics/tests.py
forensics/views.py

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch pip/goggles-109

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

stage-review · 2026-06-24T06:12:56Z

Ready to review this PR? Stage has broken it down into 4 individual chapters for you:

	Title
1	Define export event limit setting
2	Refactor export logic into header and streaming generator
3	Implement streaming response in export view
4	Verify streaming export and regression tests

_{Chapters generated by Stage for commit 57d292a on Jun 24, 2026 6:27am UTC.}

agent-p1p · 2026-06-24T06:16:34Z

Adversarial review

Verdict: Changes requested (1 blocking finding).

Blocking

forensics/views.py:426 / forensics/analysis.py:1061-1089 — the endpoint still eagerly builds a full-group header before the streaming response exists. header_events = list(valid_events_for_group(group)) loads every valid event, then agent_state_export_header_for_group() builds timeline_payload_for_group, human_action_groups_for_group, message_traces_from_events, and group_summary from that full list. timeline_payload_for_group() still includes timeline["items"] for every placed event and excluded["event_ids"] for every unplaced event, human_action_groups_for_group() can append an event_row for every human-action event, and _stream_agent_state_export() serializes that whole header with one json.dumps(header) before yielding any streamed event row. This removes raw_context/raw_kind from the header and streams the final events array, but it does not remove the unbounded full-group path the issue called out; a large group can still force the worker to hold the full event list plus full timeline/action/message structures plus the header JSON string. Fix by making the exported header bounded/streamed too, or by adding an explicit size guard/413 if the complete timeline/actions/messages header cannot be produced without O(group events) memory.

Suggestions

None.

Nitpicks

None.

Local check run during review: DJANGO_DEBUG=1 uv run python manage.py test forensics.tests.GroupDetailTimelineViewTests.test_group_agent_export_returns_agent_readable_json forensics.tests.GroupDetailTimelineViewTests.test_group_agent_export_streams_events_lazily_in_chunks forensics.tests.GroupDetailTimelineViewTests.test_group_agent_export_streams_complete_events_in_timeline_order forensics.tests.GroupDetailTimelineViewTests.test_group_agent_export_pretty_streams_valid_json — 4 passed.

The streaming response keeps the per-event array per-chunk, but the export header (timeline items, excluded ids, human-action rows) is inherently O(group events) and is still buffered to build the response. Gate the endpoint on the valid-event count (a single SQL COUNT, no rows materialized) via GOGGLES_AGENT_EXPORT_MAX_EVENTS so an oversized group fails predictably with a 413 + JSON error instead of spiking a worker's RSS and degrading it for all users. Completeness is preserved below the cap; <= 0 disables the guard. Addresses adversarial review on #112. Closes #109

agent-p1p · 2026-06-24T06:22:34Z

Addressed blocking finding (commit `05da568`)

The adversarial review correctly flagged that the export header is still O(group events): timeline.items places one item per placed event, timeline.excluded.event_ids one id per unplaced event, and actions can append an event_row per human-action event. Streaming the per-event array did not remove that path — it only moved the largest single component (full event rows + raw_context/raw_kind JSON) to per-chunk.

Per the issue's stated design tension ("completeness is the point" — a hard cap/truncation is wrong) and the review's own escape hatch ("add an explicit size guard/413 if the complete timeline/actions/messages cannot be produced without O(group events) memory"), I added a configurable event-count gate:

New setting GOGGLES_AGENT_EXPORT_MAX_EVENTS (default 250,000; <= 0 disables).
The view checks the valid-event count via a single SQL COUNT(*) (no event rows materialized) before building the header. If the group exceeds the cap, it returns a predictable 413 + JSON error ({error, event_count, max_events}) instead of letting an arbitrarily large group spike a worker's RSS.
Normal-sized groups (under the cap) still get the full, complete export — header built once, per-event array streamed chunk-by-chunk. No truncation below the cap.
Corrected the now-inaccurate agent_state_export_header_for_group docstring (the header IS O(events); the guard is what bounds the worst case).

Verification (CI-exact commands, local)

DJANGO_DEBUG=1 uv run python manage.py test — 160 passed (+2 new: 413-fires-over-cap, cap-disabled-streams-full).
uv run ruff check . — clean; uv run ruff format --check . — clean.
DJANGO_DEBUG=1 uv run python manage.py check — no issues; makemigrations --check --dry-run — no changes.

Note: config/settings.py is a repo-declared sensitive path (one additive config constant, no auth/token/secret logic) — flagged for the merge gate.

The agent-state export was the last unbounded full-group path: it materialized every valid event (with its raw_context/raw_kind JSON) into a Python list, built one nested export dict, and serialized one large JSON string -- all resident in a single gunicorn worker's heap. A large group could spike a worker's RSS and block/OOM it for all users, the same memory-exhaustion class the bounded landing/detail/ tabs/timeline paths were introduced to avoid. Stream the response instead, preserving completeness (no cap, no truncation): - Split the build into a bounded header (everything except the per-event array, built from events fetched WITHOUT the raw JSON blobs -- only agent_event_row reads those) and a lazy per-event generator (agent_event_rows_for_group) that pulls rows via .iterator(chunk_size= 2_000), ordered in SQL with nulls_last to match sorted_timeline_events without a full-materialization Python re-sort. - group_agent_export now returns a StreamingHttpResponse whose generator serializes the header once, then streams each event row individually, so memory stays roughly per-chunk + the bounded aggregate sections instead of per-group, and the body is written incrementally. Adds regression tests for lazy chunked streaming, completeness + deterministic timeline ordering, and the pretty-printed stream. Closes #109

The streaming response keeps the per-event array per-chunk, but the export header (timeline items, excluded ids, human-action rows) is inherently O(group events) and is still buffered to build the response. Gate the endpoint on the valid-event count (a single SQL COUNT, no rows materialized) via GOGGLES_AGENT_EXPORT_MAX_EVENTS so an oversized group fails predictably with a 413 + JSON error instead of spiking a worker's RSS and degrading it for all users. Completeness is preserved below the cap; <= 0 disables the guard. Addresses adversarial review on #112. Closes #109

agent-p1p added 2 commits June 24, 2026 08:26

agent-p1p force-pushed the pip/goggles-109 branch from 05da568 to 57d292a Compare June 24, 2026 06:27

agent-p1p marked this pull request as ready for review June 24, 2026 06:29

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: stream group_agent_export instead of buffering the whole group#112

fix: stream group_agent_export instead of buffering the whole group#112
agent-p1p wants to merge 2 commits into
masterfrom
pip/goggles-109

agent-p1p commented Jun 24, 2026 •

edited by stage-review Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 24, 2026 •

edited

Loading

Review limit reached

Uh oh!

stage-review Bot commented Jun 24, 2026 •

edited

Loading

Uh oh!

agent-p1p commented Jun 24, 2026

Uh oh!

agent-p1p commented Jun 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

agent-p1p commented Jun 24, 2026 • edited by stage-review Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Approach

Verification (local, full CI parity except docker postgres + dep-audit)

New regression tests

Sensitive paths

Uh oh!

coderabbitai Bot commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Uh oh!

stage-review Bot commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

agent-p1p commented Jun 24, 2026

Adversarial review

Blocking

Suggestions

Nitpicks

Uh oh!

agent-p1p commented Jun 24, 2026

Addressed blocking finding (commit 05da568)

Verification (CI-exact commands, local)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

agent-p1p commented Jun 24, 2026 •

edited by stage-review Bot

Loading

coderabbitai Bot commented Jun 24, 2026 •

edited

Loading

stage-review Bot commented Jun 24, 2026 •

edited

Loading

Addressed blocking finding (commit `05da568`)