DRAFT: Add AI-powered issue triage agent (dry-run, opt-in)#40525
Draft
benhillis wants to merge 4 commits into
Draft
DRAFT: Add AI-powered issue triage agent (dry-run, opt-in)#40525benhillis wants to merge 4 commits into
benhillis wants to merge 4 commits into
Conversation
Adds a complementary triage agent for new issues that runs in parallel with the existing rule-based wti pipeline. Reads the issue prose, asks an LLM via GitHub Models to classify it, and posts a single collapsible maintainer- facing comment with: a 1-3 sentence summary, suggested issue type, suggested component labels, missing bug-template fields, and possible duplicate issues (retrieved via gh search issues, never invented by the model). v1 is dry-run only -- no labels are applied, no issue state is changed. The model is treated as untrusted: JSON output is schema-validated, labels are intersected with both a static allowlist and the live label list, duplicate numbers are intersected with the pre-fetched candidate set, and the summary is HTML-escaped with markdown links / URLs / @mentions stripped. Idempotency uses an input-sha hash embedded in the marker comment, with a post-call re-fetch to defeat slow-run-vs-newer-edit races. Failures (model error, malformed JSON, rate limit) are silent: log to stderr, exit 0, no comment posted. Files: - .github/workflows/ai_triage.yml - workflow on issues.opened - .github/workflows/ai_triage_tests.yml - pytest CI gate on PRs touching the script - triage/ai/ai_triage.py - the agent script - triage/ai/prompt.md - version-controlled prompt template - triage/ai/test_ai_triage.py - 79 pure-function unit tests - triage/ai/README.md - design + run instructions + graduation plan - triage/ai/.gitignore - exclude __pycache__/.pytest_cache Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- sanitize_summary now strips raw HTML tags (defense-in-depth; the function name implied this and html.escape() in render_comment was the only thing preventing tags from flowing through) - main() now exits 1 on unexpected exceptions instead of swallowing them; expected silent paths (model errors, transient gh API errors on read) still exit 0 inline. Comment-upsert failures now propagate, so a 403/5xx from posting fails the workflow visibly instead of showing green. - Workflow concurrency group falls back to github.run_id if both event payload and inputs are missing, preventing a collapsed-group deadlock. - REPO is now overridable via AI_TRIAGE_REPO env var so fork testing can target the fork's own issues without write-blocking 403s. - Updated README failure-mode section to document the silent-vs-loud split. - Added 3 sanitize_summary tests for HTML stripping (script tag, attribute strip, HTML comment). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Manual workflow_dispatch only until maintainers validate output quality on real issues. Re-enable by uncommenting the issues: trigger block. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
There was a problem hiding this comment.
Pull request overview
Adds a new, opt-in AI-powered issue triage agent under triage/ai/ plus GitHub Actions workflows to run it manually (initially) and to gate changes with unit tests. This complements (and does not modify) the existing rule-based triage tooling.
Changes:
- Introduces
triage/ai/ai_triage.pyto fetch an issue, retrieve duplicate candidates, call GitHub Models, validate/clamp results, and upsert a single maintainer-facing comment. - Adds a version-controlled prompt (
triage/ai/prompt.md) and unit tests (triage/ai/test_ai_triage.py) focused on validation/sanitization/idempotency behavior. - Adds Actions workflows for manual dispatch triage and for running pytest on PRs touching
triage/ai/**.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
triage/ai/ai_triage.py |
Core triage agent: issue fetch, candidate retrieval, model call, JSON extraction/validation, sanitization, comment upsert. |
triage/ai/prompt.md |
Prompt template and strict output schema/allowlists for the model. |
triage/ai/test_ai_triage.py |
Unit tests covering parsing, validation/clamping, sanitization, hashing/marker behavior, and skip rules. |
triage/ai/README.md |
Design/usage documentation and operational posture (skip rules, failure modes, rollout plan). |
triage/ai/.gitignore |
Ignores Python cache and pytest artifacts for the new triage module. |
.github/workflows/ai_triage.yml |
Manual-dispatch workflow to run the triage agent and post/update the comment. |
.github/workflows/ai_triage_tests.yml |
CI workflow running pytest for triage/ai/** changes. |
- render_prompt: switch from sequential .replace() to single-pass re.sub. Sequential replacement let an issue body containing '{{CANDIDATES_JSON}}' (or any other placeholder token) get rewritten by a later substitution, giving the issue author a way to alter the prompt. Closes the inline review thread on ai_triage.py:331.
- find_existing_marker_comment: paginate up to 10 pages (1000 comments) instead of fetching only the first 100. Stops on first marker hit, on a short page (last page), or at the safety cap. Closes the inline thread on ai_triage.py:575.
- Workflow input 'issue' is now type: number so non-numeric values are rejected at the dispatch UI instead of crashing the script.
- README files-table updated to reflect the manual-only initial rollout.
- Removed dead _JSON_OBJECT_RE constant left over from when extract_json_object used a regex.
- Added 9 unit tests: 4 for render_prompt (basic substitution + 2 prompt-injection regression tests + unknown-placeholder passthrough) and 5 for find_existing_marker_comment (page-1 hit, page-2 hit, short-page stop, empty-page stop, cap).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Member
Author
|
Thanks for the review! All 5 findings addressed in bad3a6e.
Tests: 91 pass (was 82). All ~0.2s, no network. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Adds an opt-in, AI-powered issue triage agent that complements the existing rule-based
wti.exepipeline. Purely additive —wtiand its workflows (new_issue.yml,new_issue_comment.yml,issue_edited.yml) are completely untouched.For a designated issue, the agent reads the title and body, asks an LLM (via GitHub Models) to classify it and surface possible duplicates, then posts a single collapsible maintainer-facing comment with its analysis.
What it does
gh api.gh search issues(the model picks from this list — it cannot invent issue numbers).openai/gpt-4o-minivia thegh-modelsextension with a version-controlled prompt.@mentionsdefanged with U+200B; emails preserved by negative lookbehind).What it does NOT do (v1)
Example of the resulting comment:
Initial rollout: manual-dispatch only
The workflow's
issues.openedtrigger is commented out in this PR. The only way to invoke it is Actions → AI issue triage (dry-run) → Run workflow → enter issue number. This lets the team vet output quality on hand-picked real issues before opening the firehose. To enable automatic triage on every new issue later, uncomment theissues:block in.github/workflows/ai_triage.yml.Suggested rollout:
triage/ai/prompt.mdif needed.issues.opened.triage/ai/README.md, eventually consider v2 auto-labeling for high-confidence subsets.Security / abuse posture
gh label listoutput. Duplicate numbers must appear in the pre-fetched candidate set.html.escape()'d during render. Renders inside a<details>collapsed by default.author.type == "Bot"and[bot]login suffix) and maintainer-levelauthor_association(OWNER / MEMBER / COLLABORATOR).<!-- ai-triage:v1 input-sha=… prompt-sha=… -->is keyed on(title, body, prompt-version), re-checked after the model call.concurrency: cancel-in-progressper issue prevents pile-ups on rapid edits.Permissions
Tests
triage/ai/test_ai_triage.py— 82 unit tests, no network, ~0.2 s. Covers JSON extraction, validation/clamping, sanitization (including HTML / URL / mention / drift guards comparing the Python allowlists to the prompt template), idempotency hashing, marker parsing, comment rendering, and skip rules. Wired into CI via.github/workflows/ai_triage_tests.ymlwhich runs on any PR touchingtriage/ai/**or.github/workflows/ai_triage*.yml.End-to-end smoke testing was done from
benhillis/WSL:bug/wsl2+installand surfaced Cannot start WSL: Wsl/Service/CreateInstance/CreateVm/MountVhd/HCS/ERROR_FILE_NOT_FOUND #12753 and wsl startup failed after windows update Error code: Wsl/Service/CreateInstance/CreateVm/MountVhd/HCS/ERROR_NOT_FOUND #11253 as real duplicates.feature/distro-mgmt.gh extension install github/gh-models,models: readpermission resolution, model call, JSON parsing, validation, dedup retrieval, staleness re-check). Comment-upsert correctly failed loudly with HTTP 403 because the fork token has no write access to the upstream repo.Files
.github/workflows/ai_triage.yml.github/workflows/ai_triage_tests.ymltriage/ai/ai_triage.pytriage/ai/prompt.mdtriage/ai/test_ai_triage.pytriage/ai/README.mdtriage/ai/.gitignore__pycache__/.pytest_cacheCost
Manual-dispatch only at first → effectively zero. Once
issues.openedis enabled, onegpt-4o-minicall per non-skipped new issue (typical ~3–5K input tokens, ~300 output tokens). Within GitHub Models free quota for current WSL issue volume.How to disable / roll back
.github/workflows/ai_triage.yml.Local run
cd triage/ai python ai_triage.py --issue 12345 --dry-runRequires
ghCLI authenticated, andgh extension install github/gh-models.Opening as draft for design discussion before merge. Happy to split into multiple PRs (workflow + script + tests + docs) if that's preferred for review.