feat: add multimodal UIMessage support #230
Conversation
Walkthrough

This PR adds multimodal message part support to TanStack AI, introducing four new message part types (image, audio, video, document) with corresponding type definitions and conversion logic that preserves multimodal content when transforming between UIMessage and ModelMessage representations.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client as Client (useChat)
    participant UIMsg as UIMessage Builder
    participant Converter as Message Converter
    participant Model as ModelMessage
    rect rgba(100, 200, 150, 0.5)
        Note over Client,Model: Append Multimodal Content
        Client->>UIMsg: append({ role, content: [text, image, ...] })
        UIMsg->>Converter: uiMessageToModelMessages(uiMessage)
        activate Converter
        Converter->>Converter: Detect multimodal parts (image, audio, video, doc)
        Converter->>Model: Create ModelMessage with ContentPart[]
        deactivate Converter
        Model->>Model: content: [TextPart, ImagePart, ...] (preserved)
    end
    rect rgba(150, 150, 200, 0.5)
        Note over Model,UIMsg: Receive Multimodal Response
        Model->>Converter: modelMessageToUIMessage(modelMessage)
        activate Converter
        Converter->>Converter: Detect ContentPart[] in content
        Converter->>Converter: contentPartsToMessageParts(parts[])
        deactivate Converter
        Converter->>UIMsg: Reconstruct UIMessage with multimodal parts
        UIMsg->>Client: Update chat with [text, image, audio, video, doc] parts
    end
```
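The round-trip in the diagram can be sketched as a small pure function. This is a simplified, illustrative model only: the type names (`ContentPart`, `MessagePart`) and the function name `contentPartsToMessageParts` come from this PR's description, but the field shapes and exact signatures here are assumptions, not the real TanStack AI API.

```typescript
// Simplified stand-ins for the library's types (illustrative only; the real
// types in @tanstack/ai carry more metadata).
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image"; source: string; mimeType?: string };

type MessagePart =
  | { type: "text"; text: string }
  | { type: "image"; source: string; mimeType?: string };

// Sketch of the idea behind contentPartsToMessageParts: map every model
// content part to a UI message part instead of discarding non-text parts.
function contentPartsToMessageParts(parts: ContentPart[]): MessagePart[] {
  return parts.map((part): MessagePart => {
    switch (part.type) {
      case "text":
        return { type: "text", text: part.text };
      case "image":
        return { type: "image", source: part.source, mimeType: part.mimeType };
    }
  });
}

const parts = contentPartsToMessageParts([
  { type: "text", text: "What is in this picture?" },
  { type: "image", source: "https://example.com/cat.png" },
]);
// Both parts survive the conversion; previously the image would be stripped.
console.log(parts.length);
```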
Estimated code review effort: 3 (Moderate) | ~30 minutes
Pre-merge checks: 5 passed.
Changes
When calling `append()` with a `ModelMessage` containing multimodal content (images, audio, files), the content was stripped during the `ModelMessage → UIMessage` conversion, because `modelMessageToUIMessage()` only extracted text via `getTextContent()`. On top of that, the `parts` of a message didn't include multimodal parts, making it impossible to build chat UIs that preserve and display multimodal content.

Added new message part types and updated the conversion functions to preserve multimodal content during round-trips:
New Types (@tanstack/ai and @tanstack/ai-client):
- `ImageMessagePart` - preserves image data with source and optional metadata
- `AudioMessagePart` - preserves audio data
- `VideoMessagePart` - preserves video data (NOT TESTED)
- `DocumentMessagePart` - preserves document data (e.g., PDFs) (NOT TESTED)

Updated Conversion Functions:
- `modelMessageToUIMessage()` - now converts `ContentPart[]` to corresponding `MessagePart[]` instead of discarding non-text parts
- `uiMessageToModelMessages()` - now builds `ContentPart[]` when multimodal parts are present, preserving part ordering

Example:
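As a hedged sketch of what this enables on the client side: the part type names below follow this PR's description, but the exact `UIMessage` shape and field names are assumptions for illustration, not the real `@tanstack/ai-client` types.

```typescript
// Illustrative UIMessage shape based on this PR's description; the real
// type in @tanstack/ai-client may differ in fields and metadata.
interface UIMessage {
  role: "user" | "assistant";
  parts: Array<
    | { type: "text"; text: string }
    | { type: "image"; source: string }
    | { type: "audio"; source: string }
  >;
}

const message: UIMessage = {
  role: "user",
  parts: [
    { type: "text", text: "Transcribe this clip" },
    { type: "audio", source: "https://example.com/clip.mp3" },
  ],
};

// Before this change, only the text part survived the round-trip through
// uiMessageToModelMessages(); now multimodal parts stay in `parts`, so a
// chat UI can render them alongside the text.
const multimodal = message.parts.filter((p) => p.type !== "text");
console.log(multimodal.length);
```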
Demo
Images:
https://github.com/user-attachments/assets/5f62ab32-9f11-44f7-bfc0-87d00678e265
Audio:
https://github.com/user-attachments/assets/bbbdc2f9-f8d7-4d74-99c2-23d15a3278a3
Closes #200
Note
This contribution touches core message handling. Let me know if the approach doesn't align with the project's vision; I'm happy to iterate on it :)
This PR is not ready to be merged because:
Checklist

`pnpm run test:pr`

Release Impact
Summary by CodeRabbit
New Features
Tests