fix: resolve all 8 Windows test failures — UTF-8, CRLF, FastMCP API, stop_at (#239, #241) by azizur100389 · Pull Request #274 · tirth8205/code-review-graph

azizur100389 · 2026-04-14T14:37:24Z

Summary

Five root-cause fixes that together resolve every pre-existing Windows test failure on main. Net effect: 8 failures → 0, full green Windows CI.

Closes #239, closes #241.

Supersedes individual PRs #240 and #242 (which are being closed in favor of this bundled fix).

Bug 1 — `get_data_dir()` writes non-UTF-8 `.gitignore` on Windows

write_text() called without encoding="utf-8". The em-dash (U+2014) in the header encodes as cp1252 byte 0x97 on Windows. Any later UTF-8 read raises UnicodeDecodeError.

Fix: Add encoding="utf-8" (matches sibling _ensure_repo_gitignore).
Test: test_auto_gitignore_is_valid_utf8 — asserts UTF-8 byte sequence 0xE2 0x80 0x94, rejects cp1252 byte 0x97.

Bug 2 — Databricks notebook detection fails on CRLF

source.startswith(b"# Databricks notebook source\n") hard-codes LF. Windows core.autocrlf=true produces CRLF → 4 tests fail silently.

Fix: Parse first line robustly, strip trailing \r before exact match. Rejects prefix false positives.
Tests: test_databricks_header_crlf_line_endings, test_databricks_header_lf_line_endings_still_work, test_databricks_header_prefix_false_positive_rejected.

Bug 3 — Stale FastMCP API in async regression guard

mcp.get_tools() doesn't exist in fastmcp>=2.14.0. The guard promised by PR #231 for #46/#136 has been silently broken since it was written.

Fix: Resolve tools via getattr(crg_main, name) (mirrors sibling test). Drop @pytest.mark.asyncio.
Test: test_regression_guard_does_not_depend_on_fastmcp_internals — AST-walks guard source to reject stale mcp internal APIs.

Bugs 4-5 — `find_repo_root` walks above test sandbox

test_returns_none_without_git and test_falls_back_to_start fail on any machine where tmp_path has a git-initialized ancestor (dotfiles at ~/.git).

Fix: Add optional stop_at: Path | None = None to find_repo_root() and find_project_root(). Walk examines stop_at for .git then stops. Default None = existing behavior. All 7 production callers unchanged.
Tests: test_stop_at_prevents_escape_to_outer_git, test_stop_at_finds_git_at_boundary, test_stop_at_forwarded_to_find_repo_root.

Files changed (5 files, +302 / -35)

code_review_graph/incremental.py — UTF-8 encoding fix + stop_at API addition
code_review_graph/parser.py — CRLF-tolerant Databricks detection
tests/test_incremental.py — gitignore UTF-8 guard + stop_at regression tests
tests/test_main.py — fixed async guard + meta-guard against FastMCP API drift
tests/test_notebook.py — CRLF + LF + false-positive regression guards

Test results

23/23 targeted tests pass (all new + all previously-failing)
Full suite on Windows: 0 pre-existing failures remaining
ruff check on all 5 changed files: clean

Why this fix is safe

UTF-8 fix matches established pattern in sibling function
Databricks fix uses stricter matching than original (== not startswith)
FastMCP fix is test-only; product code unchanged
stop_at is backward-compatible (default None = existing behavior)

…irth8205#258, tirth8205#259, tirth8205#260, tirth8205#261) Four reliability hardening fixes that improve crash safety, resource cleanup, and API transparency across the graph pipeline. Fix 1: wrap incremental_trace_flows DELETEs in a transaction (tirth8205#258) ------------------------------------------------------------------- File: code_review_graph/flows.py:499-502 The DELETE loop in incremental_trace_flows deleted flow_memberships and flows rows one at a time without an explicit transaction. A crash mid-loop could leave orphaned flow_memberships rows pointing at deleted flows. Fix: wrap the DELETE loop in BEGIN IMMEDIATE / COMMIT / ROLLBACK, matching the pattern already used by store_flows at line 406. Includes an in_transaction guard to flush any implicit transaction first. Test: test_incremental_trace_flows_delete_is_atomic — verifies no orphaned memberships exist after an incremental re-trace. Fix 2: atomic FTS index rebuild (tirth8205#259) -------------------------------------- File: code_review_graph/search.py:39-54 rebuild_fts_index performed DROP, CREATE, and INSERT across three separate statements with commits between them. A crash after DROP but before CREATE would leave the DB without an FTS table entirely. Fix: wrap the entire DROP + CREATE + INSERT sequence in a single BEGIN IMMEDIATE transaction so the operation is all-or-nothing. Test: test_fts_rebuild_is_atomic — verifies double-rebuild leaves the table intact and queryable. Fix 3: EmbeddingStore context manager support (tirth8205#260) ---------------------------------------------------- File: code_review_graph/embeddings.py:391+ EmbeddingStore opens a SQLite connection in __init__ but did not implement __enter__/__exit__. If an exception occurred during usage before close() was called, the connection leaked. Fix: add __enter__ and __exit__ methods following the same pattern as GraphStore (graph.py:154-157). Test: test_supports_context_manager, test_context_manager_closes_on_exception. Fix 4: truncation signal for find_dependents (tirth8205#261) ---------------------------------------------------- File: code_review_graph/incremental.py:434-465 When find_dependents hit _MAX_DEPENDENT_FILES (500), it truncated the result and logged a warning, but the caller received a plain list with no indication that it was incomplete. Callers had to guess whether they got the full dependent set or a truncated one. Fix: introduce DependentList (a transparent list subclass with a .truncated boolean attribute). Existing callers that iterate, len(), or slice continue to work unchanged; only callers that specifically check .truncated benefit from the signal. Tests: test_truncated_flag_set_when_capped (600 deps, expects True), test_truncated_flag_false_when_not_capped (small chain, expects False). Files changed ------------- - code_review_graph/flows.py — transaction wrap around DELETE loop - code_review_graph/search.py — atomic FTS rebuild - code_review_graph/embeddings.py — __enter__/__exit__ on EmbeddingStore - code_review_graph/incremental.py — DependentList with .truncated flag - tests/test_flows.py — orphan membership check - tests/test_search.py — double-rebuild atomicity test - tests/test_embeddings.py — context manager tests - tests/test_incremental.py — truncation flag tests Test results ------------ Stage 1 (new targeted tests): 6/6 passed. Stage 2 (all affected test files): 137 passed, 3 pre-existing failures (find_repo_root flakiness + UTF-8 gitignore — covered by PRs tirth8205#274/tirth8205#242). Stage 3 (full suite): 784 passed, 8 pre-existing Windows failures (all covered by parallel PRs). Stage 4 (ruff check on all 8 changed files): clean. Stage 5 (mypy on all 4 changed source files): clean. Zero regressions.

…stop_at (tirth8205#239, tirth8205#241) Five root-cause fixes that together resolve every pre-existing Windows test failure on main. Each fix has targeted regression tests; the net effect is full green CI on Windows (8 failures -> 0). Bug 1: get_data_dir() writes non-UTF-8 .gitignore on Windows (tirth8205#239) -------------------------------------------------------------------- write_text() called without encoding="utf-8". The em-dash in the header is U+2014 which Python encodes as cp1252 byte 0x97 on Windows. Any later UTF-8 read fails with UnicodeDecodeError. Fix: add encoding="utf-8" (matches sibling _ensure_repo_gitignore). Test: test_auto_gitignore_is_valid_utf8 — asserts UTF-8 byte sequence, rejects cp1252 byte. Bug 2: Databricks notebook detection fails on CRLF line endings (tirth8205#239) ---------------------------------------------------------------------- source.startswith(b"# Databricks notebook source\n") hard-codes LF. Windows git checkout (core.autocrlf=true) produces CRLF. All Databricks handling silently bypassed — 4 tests fail. Fix: parse first line robustly, strip trailing \r before exact match. Tests: test_databricks_header_crlf_line_endings, test_databricks_header_lf_line_endings_still_work, test_databricks_header_prefix_false_positive_rejected. Bug 3: Stale FastMCP API in async regression guard (tirth8205#239) --------------------------------------------------------- test_heavy_tools_are_coroutines called mcp.get_tools() which does not exist in fastmcp>=2.14.0 (pinned in pyproject.toml). The guard has been silently broken since it was written — the protection promised by PR tirth8205#231 for tirth8205#46/tirth8205#136 was never actually enforced. Fix: resolve tools via getattr(crg_main, name) like the sibling test. Drop @pytest.mark.asyncio since no event loop is needed. Test: test_regression_guard_does_not_depend_on_fastmcp_internals — AST-walks the guard source to ensure no mcp internal API references. Bug 4-5: find_repo_root walks above test sandbox (tirth8205#241) ------------------------------------------------------- test_returns_none_without_git and test_falls_back_to_start fail on any machine where tmp_path has a git-initialized ancestor (dotfiles repo at ~/.git — very common on developer machines). Fix: add optional stop_at parameter to find_repo_root() and find_project_root(). When set, the walk examines stop_at for .git and then stops. Default is None (existing walk-to-root behavior). Fully backward-compatible — all 7 production callers unchanged. Tests: test_stop_at_prevents_escape_to_outer_git, test_stop_at_finds_git_at_boundary, test_stop_at_forwarded_to_find_repo_root. Files changed ------------- - code_review_graph/incremental.py — encoding fix + stop_at API - code_review_graph/parser.py — CRLF-tolerant Databricks detection - tests/test_incremental.py — gitignore UTF-8 guard + stop_at tests - tests/test_main.py — fixed async guard + meta-guard - tests/test_notebook.py — CRLF + LF + false-positive guards

…irth8205#258, tirth8205#259, tirth8205#260, tirth8205#261) Four reliability hardening fixes that improve crash safety, resource cleanup, and API transparency across the graph pipeline. Fix 1: wrap incremental_trace_flows DELETEs in a transaction (tirth8205#258) ------------------------------------------------------------------- File: code_review_graph/flows.py:499-502 The DELETE loop in incremental_trace_flows deleted flow_memberships and flows rows one at a time without an explicit transaction. A crash mid-loop could leave orphaned flow_memberships rows pointing at deleted flows. Fix: wrap the DELETE loop in BEGIN IMMEDIATE / COMMIT / ROLLBACK, matching the pattern already used by store_flows at line 406. Includes an in_transaction guard to flush any implicit transaction first. Test: test_incremental_trace_flows_delete_is_atomic — verifies no orphaned memberships exist after an incremental re-trace. Fix 2: atomic FTS index rebuild (tirth8205#259) -------------------------------------- File: code_review_graph/search.py:39-54 rebuild_fts_index performed DROP, CREATE, and INSERT across three separate statements with commits between them. A crash after DROP but before CREATE would leave the DB without an FTS table entirely. Fix: wrap the entire DROP + CREATE + INSERT sequence in a single BEGIN IMMEDIATE transaction so the operation is all-or-nothing. Test: test_fts_rebuild_is_atomic — verifies double-rebuild leaves the table intact and queryable. Fix 3: EmbeddingStore context manager support (tirth8205#260) ---------------------------------------------------- File: code_review_graph/embeddings.py:391+ EmbeddingStore opens a SQLite connection in __init__ but did not implement __enter__/__exit__. If an exception occurred during usage before close() was called, the connection leaked. Fix: add __enter__ and __exit__ methods following the same pattern as GraphStore (graph.py:154-157). Test: test_supports_context_manager, test_context_manager_closes_on_exception. Fix 4: truncation signal for find_dependents (tirth8205#261) ---------------------------------------------------- File: code_review_graph/incremental.py:434-465 When find_dependents hit _MAX_DEPENDENT_FILES (500), it truncated the result and logged a warning, but the caller received a plain list with no indication that it was incomplete. Callers had to guess whether they got the full dependent set or a truncated one. Fix: introduce DependentList (a transparent list subclass with a .truncated boolean attribute). Existing callers that iterate, len(), or slice continue to work unchanged; only callers that specifically check .truncated benefit from the signal. Tests: test_truncated_flag_set_when_capped (600 deps, expects True), test_truncated_flag_false_when_not_capped (small chain, expects False). Files changed ------------- - code_review_graph/flows.py — transaction wrap around DELETE loop - code_review_graph/search.py — atomic FTS rebuild - code_review_graph/embeddings.py — __enter__/__exit__ on EmbeddingStore - code_review_graph/incremental.py — DependentList with .truncated flag - tests/test_flows.py — orphan membership check - tests/test_search.py — double-rebuild atomicity test - tests/test_embeddings.py — context manager tests - tests/test_incremental.py — truncation flag tests Test results ------------ Stage 1 (new targeted tests): 6/6 passed. Stage 2 (all affected test files): 137 passed, 3 pre-existing failures (find_repo_root flakiness + UTF-8 gitignore — covered by PRs tirth8205#274/tirth8205#242). Stage 3 (full suite): 784 passed, 8 pre-existing Windows failures (all covered by parallel PRs). Stage 4 (ruff check on all 8 changed files): clean. Stage 5 (mypy on all 4 changed source files): clean. Zero regressions.

This was referenced Apr 14, 2026

fix: three Windows-only bugs in v2.3.1 (#239) #240

Closed

fix(incremental): add stop_at to find_repo_root / find_project_root (#241) #242

Closed

azizur100389 force-pushed the fix/windows-test-reliability branch from eb1818a to dea0e93 Compare April 14, 2026 15:07

azizur100389 mentioned this pull request Apr 14, 2026

fix: transaction safety, resource cleanup, and truncation signaling (#258, #259, #260, #261) #279

Open

azizur100389 force-pushed the fix/windows-test-reliability branch from dea0e93 to cd0ecda Compare April 14, 2026 23:03

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: resolve all 8 Windows test failures — UTF-8, CRLF, FastMCP API, stop_at (#239, #241)#274

fix: resolve all 8 Windows test failures — UTF-8, CRLF, FastMCP API, stop_at (#239, #241)#274
azizur100389 wants to merge 1 commit intotirth8205:mainfrom
azizur100389:fix/windows-test-reliability

azizur100389 commented Apr 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

azizur100389 commented Apr 14, 2026

Summary

Bug 1 — get_data_dir() writes non-UTF-8 .gitignore on Windows

Bug 2 — Databricks notebook detection fails on CRLF

Bug 3 — Stale FastMCP API in async regression guard

Bugs 4-5 — find_repo_root walks above test sandbox

Files changed (5 files, +302 / -35)

Test results

Why this fix is safe

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Bug 1 — `get_data_dir()` writes non-UTF-8 `.gitignore` on Windows

Bugs 4-5 — `find_repo_root` walks above test sandbox