Feature/tokenless diff provider#2467
Conversation
…Agent#2457) - Add publish_file_comments to is_supported() deny-list so describe walkthrough is not silently dropped with inline_file_summary=true - Guard get_diff_files() against path traversal: reject absolute paths and relative paths that resolve outside the repo root; log and fall back to patch-only mode instead of raising - Fix tokenless_diff_mode.md comparison table: local provider inline comments are Not supported (raises NotImplementedError) - Promote page title from H2 to H1 to match other usage-guide pages - Add tests: test_publish_file_comments_not_supported, test_path_traversal_file_not_read
PR Summary by QodoAdd tokenless unified-diff provider (stdin/file) with stdout output Description
Diagram
High-Level Assessment
Files changed (11)
|
Code Review by Qodo
Context used 1.
|
- cli: handle OSError/UnicodeDecodeError when reading --diff-file, failing fast with a clear parser.error instead of an uncaught traceback - diff_parsing: catch the specific UnidiffParseError (not bare Exception) in reconstruct_base_file; keep an empty reconstructed base as "" (never "\n") so downstream extend_patch() treats it as having no original file - diff_provider: narrow exception handling to UnidiffParseError for parsing and (OSError, UnicodeDecodeError) for working-tree reads; resolve diff paths against the repository root (_find_repository_root) instead of the raw CWD so enrichment still works when run from a subdirectory - tests: update add-to-empty base expectation to ""; add regression tests for subdirectory enrichment and missing --diff-file fail-fast; remove unused imports (F401)
|
Code review by qodo was updated up to the latest commit 304384d |
|
Hey @vectorkovacspeter, thanks for taking this! |
|
Thanks for the review. I've addressed all six findings in 304384d:
Added regression tests for the subdirectory case and the missing-file fail-fast; suite is now 25 passing. |
|
Hey @vectorkovacspeter, Could you also please address the first 3 that Qodo hasn't marked as resolved? |
… mode Addresses the new findings raised after the previous fix round: - improve crash (correctness): DiffGitProvider now implements publish_code_suggestions()/publish_code_suggestion() to render suggestions as a markdown document to stdout/--output instead of raising NotImplementedError, so `--stdin/--diff-file improve` works as documented - CWD file-disclosure risk (security): when no .git repository root is detected, working-tree enrichment is disabled (patch-only) instead of reading files from an arbitrary current directory - broad except (reliability): _write_output() now catches only (OSError, UnicodeError) and re-raises, since --output is an explicit request whose failure must not be swallowed - style: double quotes for the new CLI args; wrapped the long publish_inline_comment signature to satisfy Ruff - docs: note that improve renders suggestions as markdown in this mode - tests: add coverage for code-suggestion rendering (and empty no-op), no-repo-root patch-only mode; harden the path-traversal sentinel with a real .git root so it isolates the traversal guard
|
Done — pushed dfb19b2. Addressed the new findings:
Added tests for the suggestion rendering and the no-repo-root path; suite is 28 passing. |
|
Code review by qodo was updated up to the latest commit dfb19b2 |
|
Oh man, this bot is a hard one , but i working on it 😄 |
…ff mode - incremental crash (correctness): DiffGitProvider.get_incremental_commits() now disables incremental mode (-i has no meaning for a standalone diff), preventing a TypeError when PRReviewer takes len() of an empty commits_range - extra-config override (correctness): provider selection now forces the diff provider whenever diff content is loaded, so an extra/repo config setting config.git_provider can no longer silently derail tokenless mode - suppressed output (correctness): CLI diff mode forces config.publish_output =True so stdout/--output is always produced even if publishing was disabled - global mutation (reliability): removed the dead pr_reviewer.inline_code_comments=False write in __init__ (never read anywhere; was copied from LocalGitProvider) - test isolation (reliability): added an autouse fixture to each diff test module that snapshots and restores the mutated global settings keys - tests: added coverage for incremental-disabled, diff-content-forces-provider, and publish_output forcing
|
Pushed a05bce3 addressing the new findings:
|
|
Code review by qodo was updated up to the latest commit a05bce3 |
…und 4) - patch misparsed as hunks (correctness): the stored FilePatchInfo.patch was str(pf) from unidiff, which keeps the 'diff --git'/'index'/'---'/'+++' headers before the first '@@'. The shared hunk/line-number converter treats '+'/'-' lines as content, so those headers were emitted as a bogus leading hunk with invalid line numbers into the LLM prompt. Added to_hunk_only_patch() and normalize f.patch after base reconstruction (which still needs the full headers to parse), matching platform providers. - isort (I001): collapsed the two-name diff_parsing import to one line and sorted imports in the diff test modules; verified with ruff. - test isolation: tests now mutate settings only through an autouse `cfg` fixture that snapshots and restores the diff-mode keys (covers run()'s indirect mutations too); no bare get_settings().set() in test bodies. - test style: replaced `assert False` with pytest.raises (B011). - tests: added a hunk-only-patch assertion.
|
Pushed ae473c4. Addressed the findings:
32 tests pass. Note: I scoped changes to this feature and left pre-existing repo lint (e.g. the duplicate GiteaProvider import, some long lines) untouched to keep the diff focused — happy to fold those in separately if you'd like. |
|
Code review by qodo was updated up to the latest commit ae473c4 |
|
@naorpeled all of Qodo's findings are now resolved — on the latest push it finally came back empty-handed. 🎉 Genuinely impressed with that review bot, by the way: it caught a couple of real ones I'd have shipped (the improve crash and the diff-metadata-misparsed-as-hunks bug were proper catches). It's also… thorough. Four rounds deep I'm fairly sure it would flag the Mona Lisa for an unsorted import. 😄 No complaints though — the code is better for it. Ready for your review whenever you have a chance, and happy to make any further changes you'd like. |
feature: tokenless local diff mode (
--stdin/--diff-file)Closes #2457
Summary
Adds a purely local, stateless way to run PR-Agent against a unified diff supplied on
stdinor from a file — no git-platform API token and no PR URL required. Thegenerated review/improvements/description are printed to
stdout(markdown), and canadditionally be written to a file with
--output.Supported commands:
review,improve,describe,ask.Why
Today PR-Agent requires a platform-specific token (GitHub PAT/App, GitLab token, …)
because it talks to the hosting platform's API to fetch PR data and publish comments.
In security-first / restricted environments, teams often manage repositories over SSH
and deliberately avoid issuing HTTP access tokens, since every static token is an extra
credential that can leak.
This feature lets those users get AI code reviews while keeping a strict, token-free
posture:
host's API is blocked.
pre-commit/pre-pushhooks.How to use
--stdin--diff-file <path>--output <path>The provider is selected automatically when
--stdinor--diff-fileis present, andthe usual
--pr_url/--issue_urlrequirement is bypassed.Any LLM supported by PR-Agent works, including a fully local / OpenAI-compatible
endpoint (e.g. Ollama or vLLM) for an end-to-end air-gapped setup.
How it works
git_provider = "diff"(DiffGitProvider), parses the unifieddiff with the
unidifflibrary into PR-Agent'sFilePatchInfoobjects.reverse-applies the diff to reconstruct the original (base) file, so each hunk
gets full surrounding context (
extend_patch) — matching the quality of the normalAPI path rather than only the diff's few context lines.
back to a patch-only review instead of producing a wrong base file.
stdout, and to--output <path>when provided.Design notes / scope
branch-based
localprovider (which requires a clean repo and a target branch). Thelocalprovider is left untouched.committable code suggestions, labels, reactions) are no-ops /
NotImplementedError,mirroring
LocalGitProvider;is_supported()reports them as unsupported so thetools degrade gracefully.
..-escapes from the diff are rejected — so an untrusted diff cannot make the toolread files outside the repo.
unidiff(MIT), added torequirements.txt.Testing
New tests under
tests/unittest/:test_diff_parsing.py— parsing (modify/add/delete/rename/binary) and base-filereverse-apply (success, drift → patch-only, multi-hunk, CRLF, EOF-newline).
test_diff_provider.py— provider behavior, stdout/file output, temporary-commentsuppression, malformed-diff guard, capability gating, and a path-traversal guard
(with a sentinel test verified to fail if the guard is removed).
test_cli_diff_mode.py— CLI flag parsing.test_diff_mode_e2e.py— base reconstruction from the working tree and a mocked-LLMrun through
PRReviewer.All diff-feature tests pass. Unrelated pre-existing test-collection errors in some
test_mosaico_*files (optional dependencies not installed) are not affected by thischange.
Documentation
Adds
docs/docs/usage-guide/tokenless_diff_mode.mdand a navigation entry indocs/mkdocs.yml, including usage, the working-tree vs. patch-only quality trade-off,and how this differs from the
localprovider.AI assistance disclosure
In the interest of transparency: this contribution was developed with substantial help
from an AI coding assistant — it contains AI-generated and/or AI-assisted code,
tests, and documentation. All of it was reviewed, tested, and validated by me before
submission, and I take full responsibility for the contents of this PR. I'm happy to
adjust anything to fit the project's conventions and review feedback.