Skip to content

fix: resolve all 8 Windows test failures — UTF-8, CRLF, FastMCP API, stop_at (#239, #241)#274

Open
azizur100389 wants to merge 1 commit intotirth8205:mainfrom
azizur100389:fix/windows-test-reliability
Open

fix: resolve all 8 Windows test failures — UTF-8, CRLF, FastMCP API, stop_at (#239, #241)#274
azizur100389 wants to merge 1 commit intotirth8205:mainfrom
azizur100389:fix/windows-test-reliability

Conversation

@azizur100389
Copy link
Copy Markdown
Contributor

Summary

Five root-cause fixes that together resolve every pre-existing Windows test failure on main. Net effect: 8 failures → 0, full green Windows CI.

Closes #239, closes #241.

Supersedes individual PRs #240 and #242 (which are being closed in favor of this bundled fix).


Bug 1 — get_data_dir() writes non-UTF-8 .gitignore on Windows

write_text() called without encoding="utf-8". The em-dash (U+2014) in the header encodes as cp1252 byte 0x97 on Windows. Any later UTF-8 read raises UnicodeDecodeError.

Fix: Add encoding="utf-8" (matches sibling _ensure_repo_gitignore).
Test: test_auto_gitignore_is_valid_utf8 — asserts UTF-8 byte sequence 0xE2 0x80 0x94, rejects cp1252 byte 0x97.

Bug 2 — Databricks notebook detection fails on CRLF

source.startswith(b"# Databricks notebook source\n") hard-codes LF. Windows core.autocrlf=true produces CRLF → 4 tests fail silently.

Fix: Parse first line robustly, strip trailing \r before exact match. Rejects prefix false positives.
Tests: test_databricks_header_crlf_line_endings, test_databricks_header_lf_line_endings_still_work, test_databricks_header_prefix_false_positive_rejected.

Bug 3 — Stale FastMCP API in async regression guard

mcp.get_tools() doesn't exist in fastmcp>=2.14.0. The guard promised by PR #231 for #46/#136 has been silently broken since it was written.

Fix: Resolve tools via getattr(crg_main, name) (mirrors sibling test). Drop @pytest.mark.asyncio.
Test: test_regression_guard_does_not_depend_on_fastmcp_internals — AST-walks guard source to reject stale mcp internal APIs.

Bugs 4-5 — find_repo_root walks above test sandbox

test_returns_none_without_git and test_falls_back_to_start fail on any machine where tmp_path has a git-initialized ancestor (dotfiles at ~/.git).

Fix: Add optional stop_at: Path | None = None to find_repo_root() and find_project_root(). Walk examines stop_at for .git then stops. Default None = existing behavior. All 7 production callers unchanged.
Tests: test_stop_at_prevents_escape_to_outer_git, test_stop_at_finds_git_at_boundary, test_stop_at_forwarded_to_find_repo_root.


Files changed (5 files, +302 / -35)

  • code_review_graph/incremental.py — UTF-8 encoding fix + stop_at API addition
  • code_review_graph/parser.py — CRLF-tolerant Databricks detection
  • tests/test_incremental.py — gitignore UTF-8 guard + stop_at regression tests
  • tests/test_main.py — fixed async guard + meta-guard against FastMCP API drift
  • tests/test_notebook.py — CRLF + LF + false-positive regression guards

Test results

  • 23/23 targeted tests pass (all new + all previously-failing)
  • Full suite on Windows: 0 pre-existing failures remaining
  • ruff check on all 5 changed files: clean

Why this fix is safe

  • UTF-8 fix matches established pattern in sibling function
  • Databricks fix uses stricter matching than original (== not startswith)
  • FastMCP fix is test-only; product code unchanged
  • stop_at is backward-compatible (default None = existing behavior)

@azizur100389 azizur100389 force-pushed the fix/windows-test-reliability branch from eb1818a to dea0e93 Compare April 14, 2026 15:07
azizur100389 added a commit to azizur100389/code-review-graph that referenced this pull request Apr 14, 2026
…irth8205#258, tirth8205#259, tirth8205#260, tirth8205#261)

Four reliability hardening fixes that improve crash safety, resource
cleanup, and API transparency across the graph pipeline.

Fix 1: wrap incremental_trace_flows DELETEs in a transaction (tirth8205#258)
-------------------------------------------------------------------
File: code_review_graph/flows.py:499-502

The DELETE loop in incremental_trace_flows deleted flow_memberships and
flows rows one at a time without an explicit transaction.  A crash
mid-loop could leave orphaned flow_memberships rows pointing at deleted
flows.

Fix: wrap the DELETE loop in BEGIN IMMEDIATE / COMMIT / ROLLBACK,
matching the pattern already used by store_flows at line 406.  Includes
an in_transaction guard to flush any implicit transaction first.

Test: test_incremental_trace_flows_delete_is_atomic — verifies no
orphaned memberships exist after an incremental re-trace.

Fix 2: atomic FTS index rebuild (tirth8205#259)
--------------------------------------
File: code_review_graph/search.py:39-54

rebuild_fts_index performed DROP, CREATE, and INSERT across three
separate statements with commits between them.  A crash after DROP but
before CREATE would leave the DB without an FTS table entirely.

Fix: wrap the entire DROP + CREATE + INSERT sequence in a single
BEGIN IMMEDIATE transaction so the operation is all-or-nothing.

Test: test_fts_rebuild_is_atomic — verifies double-rebuild leaves the
table intact and queryable.

Fix 3: EmbeddingStore context manager support (tirth8205#260)
----------------------------------------------------
File: code_review_graph/embeddings.py:391+

EmbeddingStore opens a SQLite connection in __init__ but did not
implement __enter__/__exit__.  If an exception occurred during usage
before close() was called, the connection leaked.

Fix: add __enter__ and __exit__ methods following the same pattern as
GraphStore (graph.py:154-157).

Test: test_supports_context_manager, test_context_manager_closes_on_exception.

Fix 4: truncation signal for find_dependents (tirth8205#261)
----------------------------------------------------
File: code_review_graph/incremental.py:434-465

When find_dependents hit _MAX_DEPENDENT_FILES (500), it truncated the
result and logged a warning, but the caller received a plain list with
no indication that it was incomplete.  Callers had to guess whether
they got the full dependent set or a truncated one.

Fix: introduce DependentList (a transparent list subclass with a
.truncated boolean attribute).  Existing callers that iterate, len(),
or slice continue to work unchanged; only callers that specifically
check .truncated benefit from the signal.

Tests: test_truncated_flag_set_when_capped (600 deps, expects True),
test_truncated_flag_false_when_not_capped (small chain, expects False).

Files changed
-------------
- code_review_graph/flows.py — transaction wrap around DELETE loop
- code_review_graph/search.py — atomic FTS rebuild
- code_review_graph/embeddings.py — __enter__/__exit__ on EmbeddingStore
- code_review_graph/incremental.py — DependentList with .truncated flag
- tests/test_flows.py — orphan membership check
- tests/test_search.py — double-rebuild atomicity test
- tests/test_embeddings.py — context manager tests
- tests/test_incremental.py — truncation flag tests

Test results
------------
Stage 1 (new targeted tests): 6/6 passed.
Stage 2 (all affected test files): 137 passed, 3 pre-existing failures
  (find_repo_root flakiness + UTF-8 gitignore — covered by PRs tirth8205#274/tirth8205#242).
Stage 3 (full suite): 784 passed, 8 pre-existing Windows failures
  (all covered by parallel PRs).
Stage 4 (ruff check on all 8 changed files): clean.
Stage 5 (mypy on all 4 changed source files): clean.

Zero regressions.
…stop_at (tirth8205#239, tirth8205#241)

Five root-cause fixes that together resolve every pre-existing Windows
test failure on main.  Each fix has targeted regression tests; the net
effect is full green CI on Windows (8 failures -> 0).

Bug 1: get_data_dir() writes non-UTF-8 .gitignore on Windows (tirth8205#239)
--------------------------------------------------------------------
write_text() called without encoding="utf-8".  The em-dash in the header
is U+2014 which Python encodes as cp1252 byte 0x97 on Windows.  Any later
UTF-8 read fails with UnicodeDecodeError.

Fix: add encoding="utf-8" (matches sibling _ensure_repo_gitignore).
Test: test_auto_gitignore_is_valid_utf8 — asserts UTF-8 byte sequence,
rejects cp1252 byte.

Bug 2: Databricks notebook detection fails on CRLF line endings (tirth8205#239)
----------------------------------------------------------------------
source.startswith(b"# Databricks notebook source\n") hard-codes LF.
Windows git checkout (core.autocrlf=true) produces CRLF.  All Databricks
handling silently bypassed — 4 tests fail.

Fix: parse first line robustly, strip trailing \r before exact match.
Tests: test_databricks_header_crlf_line_endings,
test_databricks_header_lf_line_endings_still_work,
test_databricks_header_prefix_false_positive_rejected.

Bug 3: Stale FastMCP API in async regression guard (tirth8205#239)
---------------------------------------------------------
test_heavy_tools_are_coroutines called mcp.get_tools() which does not
exist in fastmcp>=2.14.0 (pinned in pyproject.toml).  The guard has
been silently broken since it was written — the protection promised by
PR tirth8205#231 for tirth8205#46/tirth8205#136 was never actually enforced.

Fix: resolve tools via getattr(crg_main, name) like the sibling test.
Drop @pytest.mark.asyncio since no event loop is needed.
Test: test_regression_guard_does_not_depend_on_fastmcp_internals —
AST-walks the guard source to ensure no mcp internal API references.

Bug 4-5: find_repo_root walks above test sandbox (tirth8205#241)
-------------------------------------------------------
test_returns_none_without_git and test_falls_back_to_start fail on any
machine where tmp_path has a git-initialized ancestor (dotfiles repo at
~/.git — very common on developer machines).

Fix: add optional stop_at parameter to find_repo_root() and
find_project_root().  When set, the walk examines stop_at for .git and
then stops.  Default is None (existing walk-to-root behavior).  Fully
backward-compatible — all 7 production callers unchanged.
Tests: test_stop_at_prevents_escape_to_outer_git,
test_stop_at_finds_git_at_boundary,
test_stop_at_forwarded_to_find_repo_root.

Files changed
-------------
- code_review_graph/incremental.py — encoding fix + stop_at API
- code_review_graph/parser.py — CRLF-tolerant Databricks detection
- tests/test_incremental.py — gitignore UTF-8 guard + stop_at tests
- tests/test_main.py — fixed async guard + meta-guard
- tests/test_notebook.py — CRLF + LF + false-positive guards
@azizur100389 azizur100389 force-pushed the fix/windows-test-reliability branch from dea0e93 to cd0ecda Compare April 14, 2026 23:03
azizur100389 added a commit to azizur100389/code-review-graph that referenced this pull request Apr 14, 2026
…irth8205#258, tirth8205#259, tirth8205#260, tirth8205#261)

Four reliability hardening fixes that improve crash safety, resource
cleanup, and API transparency across the graph pipeline.

Fix 1: wrap incremental_trace_flows DELETEs in a transaction (tirth8205#258)
-------------------------------------------------------------------
File: code_review_graph/flows.py:499-502

The DELETE loop in incremental_trace_flows deleted flow_memberships and
flows rows one at a time without an explicit transaction.  A crash
mid-loop could leave orphaned flow_memberships rows pointing at deleted
flows.

Fix: wrap the DELETE loop in BEGIN IMMEDIATE / COMMIT / ROLLBACK,
matching the pattern already used by store_flows at line 406.  Includes
an in_transaction guard to flush any implicit transaction first.

Test: test_incremental_trace_flows_delete_is_atomic — verifies no
orphaned memberships exist after an incremental re-trace.

Fix 2: atomic FTS index rebuild (tirth8205#259)
--------------------------------------
File: code_review_graph/search.py:39-54

rebuild_fts_index performed DROP, CREATE, and INSERT across three
separate statements with commits between them.  A crash after DROP but
before CREATE would leave the DB without an FTS table entirely.

Fix: wrap the entire DROP + CREATE + INSERT sequence in a single
BEGIN IMMEDIATE transaction so the operation is all-or-nothing.

Test: test_fts_rebuild_is_atomic — verifies double-rebuild leaves the
table intact and queryable.

Fix 3: EmbeddingStore context manager support (tirth8205#260)
----------------------------------------------------
File: code_review_graph/embeddings.py:391+

EmbeddingStore opens a SQLite connection in __init__ but did not
implement __enter__/__exit__.  If an exception occurred during usage
before close() was called, the connection leaked.

Fix: add __enter__ and __exit__ methods following the same pattern as
GraphStore (graph.py:154-157).

Test: test_supports_context_manager, test_context_manager_closes_on_exception.

Fix 4: truncation signal for find_dependents (tirth8205#261)
----------------------------------------------------
File: code_review_graph/incremental.py:434-465

When find_dependents hit _MAX_DEPENDENT_FILES (500), it truncated the
result and logged a warning, but the caller received a plain list with
no indication that it was incomplete.  Callers had to guess whether
they got the full dependent set or a truncated one.

Fix: introduce DependentList (a transparent list subclass with a
.truncated boolean attribute).  Existing callers that iterate, len(),
or slice continue to work unchanged; only callers that specifically
check .truncated benefit from the signal.

Tests: test_truncated_flag_set_when_capped (600 deps, expects True),
test_truncated_flag_false_when_not_capped (small chain, expects False).

Files changed
-------------
- code_review_graph/flows.py — transaction wrap around DELETE loop
- code_review_graph/search.py — atomic FTS rebuild
- code_review_graph/embeddings.py — __enter__/__exit__ on EmbeddingStore
- code_review_graph/incremental.py — DependentList with .truncated flag
- tests/test_flows.py — orphan membership check
- tests/test_search.py — double-rebuild atomicity test
- tests/test_embeddings.py — context manager tests
- tests/test_incremental.py — truncation flag tests

Test results
------------
Stage 1 (new targeted tests): 6/6 passed.
Stage 2 (all affected test files): 137 passed, 3 pre-existing failures
  (find_repo_root flakiness + UTF-8 gitignore — covered by PRs tirth8205#274/tirth8205#242).
Stage 3 (full suite): 784 passed, 8 pre-existing Windows failures
  (all covered by parallel PRs).
Stage 4 (ruff check on all 8 changed files): clean.
Stage 5 (mypy on all 4 changed source files): clean.

Zero regressions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

1 participant