Skip to content

Refactor: Java comparator result handling into focused helpers#2076

Open
shreejaykurhade wants to merge 2 commits intocodeflash-ai:mainfrom
shreejaykurhade:refactor/verification-comparator
Open

Refactor: Java comparator result handling into focused helpers#2076
shreejaykurhade wants to merge 2 commits intocodeflash-ai:mainfrom
shreejaykurhade:refactor/verification-comparator

Conversation

@shreejaykurhade
Copy link
Copy Markdown

@shreejaykurhade shreejaykurhade commented Apr 20, 2026

This PR refactors compare_test_results in codeflash/languages/java/comparator.py to reduce local complexity without changing comparison behavior. Issue #526

The original function handled several responsibilities in one place:

  • Java/runtime and SQLite path validation
  • comparator subprocess execution
  • comparator JSON parsing and stderr handling
  • translation of comparator diffs into TestDiff objects
  • final equivalence/summary validation and logging

That made the function harder to review and harder to modify safely, especially in a path that mixes subprocess failure handling with result translation.

What Changed

The result-handling portion of compare_test_results is now split into focused private helpers:

  • _parse_comparator_output
    Validates stdout, parses the JSON payload, and preserves the existing stderr/debug logging behavior.

  • _build_test_diffs
    Converts the Java comparator’s diff payload into TestDiff objects and keeps the per-diff debug logging in one place.

  • _finalize_comparison_result
    Applies the final summary validation (actualComparisons, skipped placeholder/deserialization counts) and centralizes the final equivalence logging.

compare_test_results now acts as the orchestration layer for:

  • prerequisite validation
  • Java comparator execution
  • error short-circuiting
  • output parsing
  • diff construction
  • final result selection

Why This Is Better

This improves maintainability by:

  • reducing the responsibility surface of the main comparator entry point
  • separating subprocess/output concerns from domain-level diff construction
  • making the comparison flow easier to scan top-to-bottom
  • lowering the risk of future changes to JSON parsing or diff translation accidentally affecting subprocess/error handling

This is intended to be a behavior-preserving refactor, not a comparison-logic change.

Behavior / Compatibility

No comparison behavior is intended to change.

In particular, this refactor preserves:

  • Java executable / runtime JAR validation
  • existing subprocess invocation and timeout behavior
  • empty-output and malformed-JSON handling
  • error handling for comparator-provided JSON errors
  • exit-code validation
  • TestDiff construction rules
  • the actualComparisons == 0 safeguard
  • existing summary/debug logging semantics

Validation

uv run pytest -q tests/test_languages/test_java/test_comparator.py
uv run ruff check codeflash/languages/java/comparator.py

Results:

  • 76 passed
  • ruff passed
image

Risk

Low.

This is a structural refactor in a well-covered module with dedicated Java comparator tests already in place. The refactor preserves the original execution order and error handling while only extracting cohesive result-processing steps.

@shreejaykurhade
Copy link
Copy Markdown
Author

@Saga4 @KRRT7 please review.

@KRRT7
Copy link
Copy Markdown
Collaborator

KRRT7 commented Apr 20, 2026

@shreejaykurhade hi, thanks for the PRs, can you hold off on future PRs? let me come up with issues and context so that you're better able to help us

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants