refactor(mdxish): replace brace-escaping preprocessing with lenient expression tokenizer by eaglethrost · Pull Request #1531 · readmeio/markdown

eaglethrost · 2026-07-02T09:38:42Z

🎫 Resolve RM-17232

🎯 What does this PR do?

Removes the brace-escaping preprocessing in favour of a lenient MDX expression tokenizer.

We've carried escapeProblematicBraces for a while: a string pre-pass that tries to predict which {…} the parser will choke on and escapes them to \{. It has worked, but we keep having to iterate on it. Every new case (code blocks, attributes, HTML boundaries, export ranges…) is another carve-out, because a regex or hand-rolled state machine can only ever approximate the real grammar. It has become convoluted and fragile enough that patching it further felt like the wrong direction.

So I explored getting rid of it entirely to make brace handling more stable. This is the result: a ~90-line micromark tokenizer (lib/micromark/mdx-expression-lenient/) that returns nok on an unbalanced brace instead of throwing. When that happens, micromark rolls back and the { renders as literal text, decided by the real grammar rather than a heuristic. This deletes ~1,600 LOC (the pre-pass + its tests) in exchange for ~90 LOC of tokenizer, and the disagreements between the pre-pass and the parser just stop happening. Seems to work well across the suite.
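
To make the ok/nok idea concrete, here is a minimal standalone sketch in TypeScript. It is not the tokenizer from this PR and does not use micromark's API: `scanExpression` and `ScanResult` are hypothetical names, and the scan deliberately ignores braces inside strings or comments as well as any paragraph-boundary rule.

```typescript
// Simplified model of the ok/nok decision. NOT the PR's micromark tokenizer;
// the names below are hypothetical and exist only for illustration.
type ScanResult = { ok: true; end: number } | { ok: false };

/**
 * Starting at an opening brace, walk forward while tracking nesting depth.
 * Returns ok with the index just past the matching `}`. Returns nok
 * (ok: false) if the input ends first, so a caller can keep `{` as text.
 */
function scanExpression(input: string, start: number): ScanResult {
  if (input.charAt(start) !== '{') return { ok: false };
  let depth = 0;
  for (let i = start; i < input.length; i++) {
    const ch = input.charAt(i);
    if (ch === '{') {
      depth++;
    } else if (ch === '}') {
      depth--;
      // Depth returns to zero exactly at the brace that closes `start`.
      if (depth === 0) return { ok: true, end: i + 1 };
    }
  }
  // Ran out of input with the brace still open: report nok instead of throwing.
  return { ok: false };
}
```

In this model, `scanExpression('Hello {user.name', 6)` reports nok, which corresponds to the case where the `{` is left as literal text rather than raising a parse error.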

Examples:

{1 + 1}            → 2                (balanced evaluates)
Hello {user.name   → Hello {user.name (unbalanced stays literal, no \{)
<div>{foo </div>   → <div>{foo </div> (no backslash leak)

export default function Page() {
  return <div>{cond ? (<p>a</p>) : (<p>b</p>)}</div>;
}

<Page />

Previously this threw "Could not parse import/exports with acorn"; now it renders.

🧪 QA tips

{1+1} still evaluates; unbalanced/paragraph-spanning braces render literally with no \{
Expressions inside callouts, lists, tables, headings, and export components
New suites: __tests__/lib/micromark/mdx-expression-lenient*.test.ts (incl. ReDoS vectors)
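
As a rough way to reason about the first and third QA points, the sketch below splits a line into literal text and balanced expressions in a single pass. It is an illustrative model only, not code from this PR: `splitSegments` and `Segment` are hypothetical names, and it ignores strings, comments, and paragraph boundaries. Braces are paired with a stack so that an input made of many unclosed `{` is not rescanned from every brace.

```typescript
// Illustrative model only; `Segment` and `splitSegments` are hypothetical names.
type Segment = { type: 'text' | 'expression'; value: string };

function splitSegments(input: string): Segment[] {
  // Pass 1: pair every `{` with its `}` using a stack. Unpaired braces get no entry.
  const closeOf = new Map<number, number>();
  const stack: number[] = [];
  for (let i = 0; i < input.length; i++) {
    const ch = input.charAt(i);
    if (ch === '{') {
      stack.push(i);
    } else if (ch === '}') {
      const open = stack.pop();
      if (open !== undefined) closeOf.set(open, i);
    }
  }

  // Pass 2: emit balanced spans as expressions and everything else as text.
  const segments: Segment[] = [];
  let text = '';
  let i = 0;
  while (i < input.length) {
    const close = closeOf.get(i);
    if (close === undefined) {
      text += input.charAt(i); // ordinary character, or an unbalanced brace kept literally
      i++;
      continue;
    }
    if (text) segments.push({ type: 'text', value: text });
    text = '';
    segments.push({ type: 'expression', value: input.slice(i + 1, close) });
    i = close + 1;
  }
  if (text) segments.push({ type: 'text', value: text });
  return segments;
}
```

Under these assumptions, `splitSegments('Hello {user.name')` yields a single text segment with the brace intact, while `splitSegments('{1+1}')` yields one expression segment containing `1+1`.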

📸 Screenshot or Loom

N/A — parser-only change, covered by tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-07-02T09:51:10Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 5fe75297-ca4a-4e1f-8bc8-ef8c2137d48c

📥 Commits

Reviewing files that changed from the base of the PR and between a54d73d and 65afe74.

📒 Files selected for processing (1)

processor/transform/mdxish/remove-jsx-comments.ts

🔗 Linked repositories identified

CodeRabbit considers these linked repositories for cross-repo context during reviews:

readmeio/ai (manual)
readmeio/gitto (manual)
readmeio/markdown (manual)
readmeio/readme (manual)

🚧 Files skipped from review as they are similar to previous changes (1)

processor/transform/mdxish/remove-jsx-comments.ts

Walkthrough

This PR replaces brace-escaping preprocessing with a lenient MDX text-expression tokenizer that returns literal text for unbalanced {...} input. mdxish.ts now uses mdxExpressionLenient and a separate removeJSXComments module. The old preprocessing implementation is removed, and new Vitest coverage adds tokenizer, rendering, JSX comment, exported-declaration ternary, and ReDoS resistance tests.

Changes

Tokenizer: added mdxExpressionLenient and tokenizeTextExpression
Wiring: updated lib/mdxish.ts to use the new tokenizer and standalone JSX comment remover
Tests: added new tokenizer/rendering/ReDoS suites and exported-declaration ternary cases
Removal: deleted the old brace-escaping preprocessing module and its related tests

Sequence Diagram(s)

sequenceDiagram
  participant mdxishAstProcessor
  participant mdxExpressionLenient
  participant tokenizeTextExpression
  participant removeJSXComments

  mdxishAstProcessor->>removeJSXComments: strip JSX comments
  mdxishAstProcessor->>mdxExpressionLenient: build text-only expression extension
  mdxExpressionLenient->>tokenizeTextExpression: register tokenizer for '{'
  tokenizeTextExpression-->>mdxExpressionLenient: emit mdxTextExpression or nok

Related issues: Not specified.

Related PRs: Not specified.

Suggested labels: tests, parser, refactor

Suggested reviewers: Not specified.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@processor/transform/mdxish/remove-jsx-comments.ts`:
- Around line 3-13: The JSDoc example for removeJSXComments is inconsistent with
JSX_COMMENT_REGEX: it currently shows spaced delimiters that would not match.
Update the example in removeJSXComments so the sample string uses the exact {/*
... */} form with no spaces around the braces, and keep the documented return
value aligned with the actual behavior of the regex.
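
For reference, a regex of the shape this comment describes might look like the sketch below. The actual bodies of `JSX_COMMENT_REGEX` and `removeJSXComments` are not shown in this thread, so both definitions here are assumptions written for illustration. In particular, this version makes no attempt to skip code blocks.

```typescript
// Assumed pattern (not taken from the repository): `{/*`, then any characters
// matched non-greedily across lines, then `*/}`, with no spaces around the braces.
const JSX_COMMENT_REGEX = /\{\/\*[\s\S]*?\*\/\}/g;

/**
 * Removes JSX comments of the exact form {/* ... *\/} from a string.
 *
 * Example: removeJSXComments('Hello {/* note *\/} world') returns 'Hello  world'.
 */
function removeJSXComments(source: string): string {
  return source.replace(JSX_COMMENT_REGEX, '');
}
```

With this assumed pattern, a spaced form such as `{ /* note */ }` is left untouched, which is why a doc example using spaced delimiters would not match the documented return value.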

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: b7d4ef3b-27f9-41dc-91eb-c8c13dbc46bc

📥 Commits

Reviewing files that changed from the base of the PR and between 945fbf9 and f43c181.

📒 Files selected for processing (10)

__tests__/lib/mdxish/exports.test.ts
__tests__/lib/micromark/mdx-expression-lenient-redos.test.ts
__tests__/lib/micromark/mdx-expression-lenient.test.ts
__tests__/transformers/preprocess-jsx-expressions.test.ts
__tests__/transformers/preprocess-redos-attack.test.ts
lib/mdxish.ts
lib/micromark/mdx-expression-lenient/index.ts
lib/micromark/mdx-expression-lenient/syntax.ts
processor/transform/mdxish/preprocess-jsx-expressions.ts
processor/transform/mdxish/remove-jsx-comments.ts

💤 Files with no reviewable changes (3)

__tests__/transformers/preprocess-jsx-expressions.test.ts
processor/transform/mdxish/preprocess-jsx-expressions.ts
__tests__/transformers/preprocess-redos-attack.test.ts

## Version 14.11.0

### ✨ New & Improved

* **mdxish:** replace brace-escaping preprocessing with lenient expression tokenizer ([#1531](#1531)) ([5c5a57c](5c5a57c))
* **mdxish:** support library imports in declarations ([#1530](#1530)) ([fa6ee97](fa6ee97))

### 🛠 Fixes & Updates

* correct exports map ordering ([#1529](#1529)) ([08a63dd](08a63dd))

rafegoldberg · 2026-07-02T14:13:30Z

This PR was released!

🚀 Changes included in v14.11.0

eaglethrost and others added 3 commits July 2, 2026 15:39

fix: braces expressions balance

df6108e

refactor: recreate expression tokenizer

d9b08a3

chore: tidy tokenizer doc comment and test formatting

f43c181

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai Bot requested changes Jul 2, 2026

Comment thread processor/transform/mdxish/remove-jsx-comments.ts

Update processor/transform/mdxish/remove-jsx-comments.ts

a1b53b7

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

coderabbitai Bot previously approved these changes Jul 2, 2026

kevinports and others added 2 commits July 2, 2026 08:38

Merge branch 'next' into dimas/rm-17232-pages-with-react-scripts-brea…

a54d73d

…k-after-mdxish-revert

fix: broken JSDoc comment

65afe74

kevinports dismissed coderabbitai[bot]’s stale review via 65afe74 July 2, 2026 13:52

coderabbitai Bot approved these changes Jul 2, 2026

kevinports merged commit 5c5a57c into next Jul 2, 2026
8 checks passed

kevinports deleted the dimas/rm-17232-pages-with-react-scripts-break-after-mdxish-revert branch July 2, 2026 14:06

rafegoldberg added the released label Jul 2, 2026

kevinports mentioned this pull request Jul 2, 2026

fix(mdxish): preserve <style> blocks, style objects, and .map() JSX carried over from MDX #1532

Open

5 tasks

kevinports merged 6 commits into next from dimas/rm-17232-pages-with-react-scripts-break-after-mdxish-revert

3 participants