Skip to content

chore: sync core lib and CLAUDE.md from agent-core#42

Closed
avifenesh wants to merge 1 commit into
mainfrom
chore/sync-core-next-task-20260529-110047
Closed

chore: sync core lib and CLAUDE.md from agent-core#42
avifenesh wants to merge 1 commit into
mainfrom
chore/sync-core-next-task-20260529-110047

Conversation

@avifenesh

Copy link
Copy Markdown
Contributor

Automated sync of lib/ and CLAUDE.md from agent-core.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request attempts to mitigate potential Regular Expression Denial of Service (ReDoS) vulnerabilities across several files by bounding various quantifiers and character classes. However, the review feedback points out that several of the original regular expressions—specifically those matching markdown links, module exports, and bad example blocks—were already safe from ReDoS. Furthermore, introducing arbitrary length limits on these patterns can cause regressions by failing to match valid, large inputs (such as long URLs, large configuration exports, or extensive code blocks). It is recommended to revert these specific patterns to their original, safe forms as suggested.

Comment on lines +27 to +29
// ReDoS fix: bound the negated-class captures so the matcher is linear;
// bounds far exceed any realistic markdown link, so matches are unchanged.
const linkRegex = /\[([^\]]{1,2000})\]\(([^)]{1,4000})\)/g;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The original regular expression /\\\[([^\\\\]+)\\\]\\(([^)]+)\\)/g does not have a ReDoS vulnerability. Because the negated character classes [^\\\\]+ and [^)]+ are strictly bounded by the literal characters \\\\] and \\) respectively, there is no overlapping match state that can cause catastrophic backtracking.\n\nBy introducing arbitrary limits like {1,2000} and {1,4000}, this regex will fail to match valid markdown links that contain long URLs, such as those with extensive query parameters, tracking tokens, or inline base64 data URIs (which are very common in markdown image links).\n\nWe should revert this to the original safe pattern.

Suggested change
// ReDoS fix: bound the negated-class captures so the matcher is linear;
// bounds far exceed any realistic markdown link, so matches are unchanged.
const linkRegex = /\[([^\]]{1,2000})\]\(([^)]{1,4000})\)/g;
const linkRegex = /\[([^\]]+)\]\(([^)]+)\)/g;

Comment on lines +147 to +149
// ReDoS fix: bound the \s* runs and capture length so the matcher stays linear;
// bounds exceed any realistic module.exports declaration so matches are unchanged.
const moduleExportsPattern = /module\.exports\s{0,1000}=\s{0,1000}\{([^}]{1,100000})\}/;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The original regular expression /module\\.exports\\s*=\\s*\\{([^}]+)\\}/ is completely safe from ReDoS. The negated character class [^}]+ is terminated by the literal \\}, meaning there is no ambiguity or overlapping paths for the regex engine to backtrack exponentially.\n\nLimiting the capture to 100,000 characters ({1,100000}) will cause the parser to fail on large, valid JavaScript files (such as large configuration files, localized translation dictionaries, or auto-generated asset maps) where the exported object exceeds 100KB.\n\nWe should revert this to the original safe pattern.

Suggested change
// ReDoS fix: bound the \s* runs and capture length so the matcher stays linear;
// bounds exceed any realistic module.exports declaration so matches are unchanged.
const moduleExportsPattern = /module\.exports\s{0,1000}=\s{0,1000}\{([^}]{1,100000})\}/;
const moduleExportsPattern = /module\.exports\s*=\s*\{([^}]+)\}/;

Comment on lines +117 to +121
// ReDoS fix: bound the lazy [\s\S]*? bodies so an unterminated <bad-example> or
// ``` fence cannot drive polynomial backtracking; 50k chars covers any realistic
// example block, so the stripped regions are unchanged for real content.
const BAD_EXAMPLE_TAG_PATTERN = /<bad[_\- ]?example>[\s\S]{0,50000}?<\/bad[_\- ]?example>/gi;
const BAD_EXAMPLE_CODE_PATTERN = /```[^\n]{0,500}bad[^\n]{0,500}\n[\s\S]{0,50000}?```/gi;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The original regular expressions for stripping bad examples do not possess ReDoS vulnerabilities. Lazy matching over all characters ([\\s\\S]*?) is a linear scan and does not cause catastrophic backtracking because there are no nested quantifiers or overlapping active groups.\n\nBy capping the match at 50,000 characters, any large <bad-example> block or code block (which can easily exceed 50KB when demonstrating large prompts or complex codebases) will fail to be stripped. This will lead to false positives during cross-file analysis.\n\nWe should revert these to their original safe patterns.

Suggested change
// ReDoS fix: bound the lazy [\s\S]*? bodies so an unterminated <bad-example> or
// ``` fence cannot drive polynomial backtracking; 50k chars covers any realistic
// example block, so the stripped regions are unchanged for real content.
const BAD_EXAMPLE_TAG_PATTERN = /<bad[_\- ]?example>[\s\S]{0,50000}?<\/bad[_\- ]?example>/gi;
const BAD_EXAMPLE_CODE_PATTERN = /```[^\n]{0,500}bad[^\n]{0,500}\n[\s\S]{0,50000}?```/gi;
const BAD_EXAMPLE_TAG_PATTERN = /<bad[_\- ]?example>[\s\S]*?<\/bad[_\- ]?example>/gi;
const BAD_EXAMPLE_CODE_PATTERN = /```[^\n]*bad[^\n]*\n[\s\S]*?```/gi;

@avifenesh

Copy link
Copy Markdown
Contributor Author

Superseded by latest sync (post ReDoS + prototype-pollution hardening).

@avifenesh avifenesh closed this May 29, 2026
@avifenesh avifenesh deleted the chore/sync-core-next-task-20260529-110047 branch May 29, 2026 11:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant