fix(mdxish): <HTMLBlocks> inside <Table> not rendering#1484

Merged
eaglethrost merged 17 commits into next from dimas/rm-16726-htmlblock-not-rendering-in-tables on Jun 4, 2026

@eaglethrost eaglethrost commented May 25, 2026

Contributor
🎫 Resolve RM-16726

🎯 What does this PR do?

This PR fixes <HTMLBlock> not rendering inside a JSX <Table>. To do that, it substantially changes how we parse HTMLBlock syntax: we drop the string-level content protection we've been doing and reuse the existing MDX tokenizer instead.

Root cause of rendering issue: A preprocessing step in the pipeline (preprocessJSXExpressions) encodes HTMLBlock bodies into an HTML-comment marker (<!--RDMX_HTMLBLOCK:…-->), and a later step decodes them so they can be transformed into HTMLBlock nodes. When the <HTMLBlock> is inside a <Table>, the table transformer still sees the encoded marker and fails to parse it: it uses remarkMdx, which rejects HTML comments, so the table is never parsed. The blocks were encoded because we didn't want other preprocessing steps to modify their content, and because their use of curly braces could cause expression-parsing issues.
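
To make the failure mode concrete, here is a minimal sketch of the kind of marker round-trip described above. Only the RDMX_HTMLBLOCK prefix and the use of base64 come from this PR; the function names, the regexes, and the exact payload layout are assumptions for illustration, not the removed implementation.

```typescript
// Hypothetical sketch of a protect/restore round-trip; not the removed code.
const MARKER_PREFIX = 'RDMX_HTMLBLOCK:'; // prefix from the PR text; payload layout is assumed

// UTF-8 safe base64 helpers built on the btoa/atob globals (browsers, Node 16+).
function toBase64(text: string): string {
  const bytes = new TextEncoder().encode(text);
  let binary = '';
  for (let i = 0; i < bytes.length; i++) binary += String.fromCharCode(bytes[i]);
  return btoa(binary);
}

function fromBase64(encoded: string): string {
  const binary = atob(encoded);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return new TextDecoder().decode(bytes);
}

// Early step: hide each HTMLBlock body behind an HTML comment.
function protectHtmlBlocks(source: string): string {
  return source.replace(
    /<HTMLBlock>\{`([\s\S]*?)`\}<\/HTMLBlock>/g,
    (_match: string, body: string) => `<!--${MARKER_PREFIX}${toBase64(body)}-->`,
  );
}

// Later step: turn the comment back into the original tag.
function restoreHtmlBlocks(source: string): string {
  return source.replace(
    /<!--RDMX_HTMLBLOCK:([A-Za-z0-9+\/=]*)-->/g,
    (_match: string, payload: string) => `<HTMLBlock>{\`${fromBase64(payload)}\`}</HTMLBlock>`,
  );
}

const cell = '<td><HTMLBlock>{`<div style="color: red;">Hello</div>`}</HTMLBlock></td>';
const protectedCell = protectHtmlBlocks(cell);

// Any parser that runs between the two steps sees an HTML comment inside the
// cell rather than an <HTMLBlock> tag.
console.log(protectedCell);
console.log(restoreHtmlBlocks(protectedCell) === cell);
```

In a scheme like this, whatever runs between the two steps has to tolerate the comment, which is the constraint the table transformer could not meet.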

Approach: We can now stop protecting and decoding. The mdxComponent tokenizer can capture component bodies, including multiline {} template literals, thanks to the brace-aware body states added in #1455, so we let the tokenizer claim <HTMLBlock> and read its body straight from the parsed template-literal expression. No marker round-trip, no comment for remarkMdx to choke on. (This is the same direction as @maximilianfalco's HTMLBlock-tokenizer work in #1439.)
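
As a rough illustration of reading the body straight from the parsed template-literal expression, the sketch below works on simplified node shapes. The node type names (mdxJsxFlowElement, mdxJsxTextElement, mdxFlowExpression, mdxTextExpression) are the ones the MDX mdast utilities emit; the interfaces are pared down, and readHtmlBlockBody is a hypothetical helper, not the transformer in mdxish-html-blocks.ts.

```typescript
// Simplified stand-ins for the mdast MDX node types; real nodes carry more fields.
interface ExpressionNode {
  type: 'mdxFlowExpression' | 'mdxTextExpression';
  value: string; // expression source without the surrounding { }
}

interface JsxElementNode {
  type: 'mdxJsxFlowElement' | 'mdxJsxTextElement';
  name: string | null;
  children: Array<ExpressionNode | JsxElementNode>;
}

// Hypothetical helper: returns the raw HTML held by an <HTMLBlock> element,
// or null when the element is not an HTMLBlock wrapping one template literal.
function readHtmlBlockBody(node: JsxElementNode): string | null {
  if (node.name !== 'HTMLBlock' || node.children.length !== 1) return null;

  const child = node.children[0];
  if (child.type !== 'mdxFlowExpression' && child.type !== 'mdxTextExpression') return null;

  const expression = child.value.trim();
  const isTemplateLiteral =
    expression.length >= 2 &&
    expression.charAt(0) === '`' &&
    expression.charAt(expression.length - 1) === '`';
  if (!isTemplateLiteral) return null;

  // Everything between the backticks is the body. Braces inside it need no
  // protection because the whole expression was already captured as one node.
  return expression.slice(1, -1);
}

const block: JsxElementNode = {
  type: 'mdxJsxFlowElement',
  name: 'HTMLBlock',
  children: [{ type: 'mdxFlowExpression', value: '`<style>.a { color: red; }</style>`' }],
};

console.log(readHtmlBlockBody(block));
```

The sketch ignores `${}` interpolation and escaped backticks, which a real extractor would need to decide how to treat.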

What changed:

  • Tokenizer claims <HTMLBlock>. Split the exclusion set so the micromark mdxComponent construct captures <HTMLBlock> (new TOKENIZER_MDX_COMPONENT_EXCLUDED_TAGS), while the remark string-reparse transforms still leave it alone — re-parsing it there is what would mangle bodies containing unbalanced-looking braces.
  • Adjusted the HTML block transformer (mdxish-html-blocks.ts). The transformer now extracts HTMLBlocks from three input shapes:
    1. JSX element (mdxJsxFlowElement/mdxJsxTextElement) — block context (e.g. <Callout>) and table cells (after their remarkMdx re-parse);
    2. Raw HTML blob — single-line top-level, or nested in raw HTML like an inline <div> (CommonMark slurps these whole, so we split them back out);
    3. Inline-in-paragraph: <HTMLBlock> open/close arriving as separate siblings around the expression.
  • mdxish-tables keeps a table as a JSX <Table> when a cell contains an <HTMLBlock> (block-level content a GFM cell can't represent).
  • Removed the marker machinery entirely: protectHTMLBlockContent + the RDMX_HTMLBLOCK markers, the base64 encode/decode paths, and the table-specific comment-neutralization workaround. HTMLBlock handling collapses from four locations down to one.
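
The mdxish-tables rule in the list above can be sketched as a recursive check over a table's subtree. This is an illustration over a pared-down node shape, with containsHtmlBlock and shouldStayJsxTable as hypothetical names; the actual logic in mdxish-tables may differ.

```typescript
// Pared-down tree node: just enough structure to walk a parsed <Table>.
interface TreeNode {
  type: string;
  name?: string | null;
  children?: TreeNode[];
}

// True when the node is an <HTMLBlock> element or has one anywhere below it.
function containsHtmlBlock(node: TreeNode): boolean {
  const isJsxElement =
    node.type === 'mdxJsxFlowElement' || node.type === 'mdxJsxTextElement';
  if (isJsxElement && node.name === 'HTMLBlock') return true;
  return (node.children ?? []).some(child => containsHtmlBlock(child));
}

// Hypothetical decision helper: a table with an HTMLBlock in any cell is kept
// as JSX, since a GFM table cell cannot hold block-level content.
function shouldStayJsxTable(table: TreeNode): boolean {
  return containsHtmlBlock(table);
}

// Small builder so the sample trees stay readable.
const td = (children: TreeNode[]): TreeNode => ({
  type: 'mdxJsxFlowElement',
  name: 'td',
  children,
});

const plainTable: TreeNode = {
  type: 'mdxJsxFlowElement',
  name: 'Table',
  children: [td([{ type: 'text' }])],
};

const htmlTable: TreeNode = {
  type: 'mdxJsxFlowElement',
  name: 'Table',
  children: [
    td([{ type: 'text' }]),
    td([{ type: 'mdxJsxTextElement', name: 'HTMLBlock', children: [] }]),
  ],
};

console.log(shouldStayJsxTable(plainTable), shouldStayJsxTable(htmlTable));
```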

🧪 QA tips

  • Render an <HTMLBlock> inside a <Table> cell and confirm the HTML renders without breaking the table, and sibling cells still get markdown:
    <Table>
      <tbody>
        <tr>
          <td>**bold** still works</td>
          <td><HTMLBlock>{`<div style="color: red;">Hello</div>`}</HTMLBlock></td>
        </tr>
      </tbody>
    </Table>
  • Confirm safeMode/runScripts survive, and multiple HTMLBlocks in one table all render.
  • Confirm top-level <HTMLBlock> and <HTMLBlock> in a generic <div> still render as before.
  • New coverage added in __tests__/lib/mdxish/html-blocks.test.ts.

Demo (before & after):

Screen.Recording.2026-05-25.at.7.32.39.pm.mov

@eaglethrost eaglethrost changed the title from "fix: render HTMLBlocks nested inside JSX blocks (tables)" to "fix(mdxish): <HTMLBlocks> inside <Table> not rendering" May 25, 2026
Comment thread processor/transform/mdxish/rehype-html-blocks-in-jsx.ts Fixed
@eaglethrost eaglethrost marked this pull request as ready for review May 25, 2026 09:36
@eaglethrost eaglethrost requested a review from kevinports May 26, 2026 12:21

@kevinports kevinports left a comment

Contributor

It seems like most of the complexity in this PR is dealing with the marker that protectHTMLBlockContent adds.

But we recently merged #1455 which shields multi-line template literal content in component bodies from html parsing. I wonder if it's possible to just remove protectHTMLBlockContent to significantly simplify everything here? I definitely didn't consider the HTMLBlock use case when working on #1455 and never tried removing that pre-processor.

@maximilianfalco

Contributor

I have work in #1439 that creates a tokenizer for our HTMLBlock. Maybe that can help eliminate the need to protect html block content altogether? @eaglethrost @kevinports

@eaglethrost

eaglethrost commented May 27, 2026

Contributor Author

Yeah @kevinports @maximilianfalco, I've rethought the approach and found that we can reuse both Kevin's work in #1455 and falco's tokenizer work in #1439:

  • fix(mdxish): terminal custom component breaks page rendering #1455 lets us reuse the MDX tokenizer to tokenise <HTMLBlock>, so we don't strictly need the standalone tokenizer. The main thing, as Kevin mentioned, is that we no longer need to protect the html blocks, which solves the fundamental issue of the block not rendering in tables
  • This means we can also simplify the html block transformer
  • I think it's worth combining both of your works so we get the refactor and the fix for the original ticket
  • I also found a bug when integrating into the main app: the content of an HTMLBlock placed inside a table gets unexpectedly indented in serialisation. This could be an issue in the editor serialization

Will move this to draft first to consolidate the combined logic & investigate the bug

@eaglethrost eaglethrost marked this pull request as ready for review May 28, 2026 08:42
@eaglethrost eaglethrost marked this pull request as draft May 28, 2026 12:26
@eaglethrost eaglethrost marked this pull request as ready for review May 28, 2026 13:06
@eaglethrost

Contributor Author

Getting this fix working, and so allowing html blocks inside tables, uncovered a bug where the block content in the table gets indented in the editor but renders fine in view mode. Interestingly, this also happens in the old editor in MDX, so it doesn't look like an issue from this PR specifically; it would be a separate fix, which I'll investigate.

Demo of this in old editor & MDX project, notice how the block content gets indented after round trips:

Screen.Recording.2026-05-28.at.11.10.10.pm.mov

It looks like the deserialization takes the indentation whitespace literally, so an editor-side fix might be required.

@kevinports kevinports left a comment

Contributor

Very glad to drop all the preprocessing with this revised approach. Lgtm 👍

eaglethrost and others added 2 commits June 1, 2026 18:07
| 🎫 Resolve ISSUE_ID |
| :-----------------: |

## 🎯 What does this PR do?

While fixing HTMLBlocks not rendering inside Tables in mdxish, I noticed
that once it worked, an editor round trip would unexpectedly indent the
block content lines in the editor, even though the rendering is fine &
not affected. See this demo:


https://github.com/user-attachments/assets/5b3fa862-b493-417b-b48f-fac82650133b

The root issue is that the HTMLBlock transformer in the engine captures
the content verbatim from the source: each content line's leading
whitespace is exactly the characters that sit between the backticks,
measured from column 1. Since the Table content is indented in
serialisation, that leading whitespace ends up in the block content.

The fix I went for is in the block content extraction code: we pass in
the `<HTMLBlock>` opening tag position & deindent each line relative to
that, instead of the starting column. I think it makes sense to use the
tag as the anchor column, though there are a few ways we could decide
that. I also think the fix belongs on the engine side, because I don't
think the engine should capture the content verbatim anyway (I briefly
considered putting the fix in the editor).
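
A minimal sketch of tag-relative deindentation, to show the idea rather
than the merged code. `deindentRelativeToTag` and its parameters are
hypothetical names, and clamping the strip width to the indentation the
lines actually share is an assumption made here so that content which
was never indented by serialization keeps its relative indentation.

```typescript
// Hypothetical sketch of tag-relative deindentation; not the merged code.
// `tagIndent` is the zero-based column of the opening <HTMLBlock> tag. With
// unist-style positions (1-indexed columns) a caller would pass
// `position.start.column - 1`.
function deindentRelativeToTag(content: string, tagIndent: number): string {
  // The first line starts right after the opening backtick, so whatever
  // precedes its text is not line indentation; leave it untouched.
  const [first, ...rest] = content.split('\n');

  // Indentation width of every non-blank line after the first.
  const indents = rest
    .filter(line => line.trim() !== '')
    .map(line => line.search(/\S/));
  const commonIndent = indents.length > 0 ? Math.min(...indents) : 0;

  // Assumption: never strip more than the lines actually share.
  const strip = Math.min(tagIndent, commonIndent);

  // Whitespace-only lines collapse to empty lines; the rest lose `strip` columns.
  const deindented = rest.map(line => (line.trim() === '' ? '' : line.slice(strip)));
  return [first, ...deindented].join('\n');
}

// An <HTMLBlock> nested under a list item, so the tag sits at column offset 3.
const listContent = [
  '',
  '   <div style="color: red;">',
  '     <p>Hello</p>',
  '   </div>',
  '   ',
].join('\n');

console.log(deindentRelativeToTag(listContent, 3));
```

Under these assumptions the function is idempotent: running it again on
already-deindented content strips nothing further, so repeated round
trips would not compound.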

Note that this happens in MDX as well. Haven't investigated yet but it's
likely it's an engine issue as well and not the editor.

## 🧪 QA tips

The fix deindents each `<HTMLBlock>` content line **relative to the
opening tag's column**, not the start of the line. To verify, paste each
example into the mdxish editor, confirm it renders correctly, then do an
editor round-trip (e.g. view as Markdown and reopen) — the content lines
should **not** gain extra leading indentation.

- [ ] **Indented `<HTMLBlock>` (nested under a list item)**

  ````md
  1. Here is some custom HTML:

     <HTMLBlock>{`
     <div style="color: red;">
       <p>Hello</p>
       <p>World</p>
     </div>
     `}</HTMLBlock>
  ````

The extracted content should be deindented relative to the `<HTMLBlock>`
tag, so the `<div>` sits at column 0 and the `<p>`s keep their relative
2-space indent:

  ```html
  <div style="color: red;">
    <p>Hello</p>
    <p>World</p>
  </div>
  ```

Before the fix, every round-trip would keep the list's 3-space
indentation on each line (and compound it on repeated trips).

- [ ] **`<HTMLBlock>` inside a `<Table>` cell**

  ````md
  <Table>
    <thead>
      <tr><th>Name</th><th>Markup</th></tr>
    </thead>
    <tbody>
      <tr>
        <td>Custom</td>
        <td><HTMLBlock>{`<div style="color: red;">
    <p>Hello</p>
    <p>World</p>
  </div>`}</HTMLBlock></td>
      </tr>
    </tbody>
  </Table>
  ````

The table should stay a JSX `<Table>` and the cell should render the raw
HTML. The extracted content should preserve the author's relative
indentation without the table-cell serialization indentation leaking
into the lines:

  ```html
  <div style="color: red;">
    <p>Hello</p>
    <p>World</p>
  </div>
  ```

## 📸 Screenshot or Loom

Demo of block inside Table where the indents are retained:


https://github.com/user-attachments/assets/68178bd0-0d44-4ebc-8dbb-86be1b2fad8a
@eaglethrost eaglethrost merged commit 3817fa1 into next Jun 4, 2026
8 checks passed
@eaglethrost eaglethrost deleted the dimas/rm-16726-htmlblock-not-rendering-in-tables branch June 4, 2026 08:27
rafegoldberg pushed a commit that referenced this pull request Jun 4, 2026
## Version 14.7.0
### ✨ New & Improved

* **images:** allow non centered images to have caption ([#1502](#1502)) ([15616ea](15616ea))

### 🛠 Fixes & Updates

* **mdxish:** <HTMLBlocks> inside <Table> not rendering ([#1484](#1484)) ([3817fa1](3817fa1)), closes [#1455](#1455)
* **mdxish:** normalize spacing for blank-line-split table tags ([#1493](#1493)) ([f162158](f162158))

<!--SKIP CI-->
@rafegoldberg

Collaborator

This PR was released!

🚀 Changes included in v14.7.0

jamestclark added a commit that referenced this pull request Jun 25, 2026
Addresses PR review requests for more table-rendering coverage and a
callout example, each fixture grounded in a merged bug fix:

- jsx-table-multiline-cells (#1445) — multi-paragraph cells preserved
- jsx-table-unclosed-cells (#1465) — asymmetric/unclosed cell tags recovered
- table-unwrapped-rows (#1458, #1411) — rows missing <tr>/<tbody> wrappers
- htmlblock-in-table (#1484) — <HTMLBlock> inside a <Table> cell
- legacy-vars-in-table (#1458) — legacy <<vars>> in raw table cells
- callout-icons (#1498) — blockquote + FA-class-icon callout render

Also refreshes the divergent, htmlblock-with-script, and
jsx-attribute-entities snapshots to reflect engine output changes pulled
in from the origin/next merge (invalid <p> wrappers removed around block
elements; <figcaption> now a direct child of <figure>).

Claude-Session: https://claude.ai/code/session_01GPTShf49qTsVP1AxSbpRJk
