fix: Fix the attachment markdown read path and production deployment mount by safesummer · Pull Request #647 · xerrors/Yuxi

safesummer · 2026-04-17T13:38:52Z

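Fixes failures when reading the markdown generated from chat attachments, and adds production deployment notes.

Symptoms:

Uploading a docx generates attachments/*.md
The frontend file tree shows that md file
But a later read_file('/home/gem/user-data/uploads/attachments/*.md') call from the agent may return 404, so the agent cannot continue processing the attachment content

This fix includes:

Sync the markdown converted from uploaded attachments into the LangGraph files/uploads state
Add unit and integration tests for attachment state sync and reading
Fix attachments being invisible in production because the sandbox provisioner mounted the wrong saves directory
Add documentation for production deployment, upgrade calibration, and the version record

Verified:

In the production deployment environment, markdown is generated after a docx upload
The agent can successfully read /home/gem/user-data/uploads/attachments/*.md
A real non-streaming chat request returned finished and produced a reply based on the attachment content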

xerrors · 2026-04-18T05:05:01Z

Thanks for the PR. I'll have Copilot review it first; its comments are for your reference.

Copilot

Pull request overview

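This PR aims to fix the issue where Markdown converted from chat attachments is read unreliably (or returns 404) in the agent pipeline. It does so by syncing the parsed Markdown content into LangGraph state (files/uploads), and it adds the related tests and production deployment documentation.

Changes:

After an uploaded attachment is parsed, write uploads and the parsed Markdown content (files) into LangGraph state, improving read_file availability
Update the agent state extraction logic and the unit/integration tests to cover the "upload office file -> convert to Markdown -> readable from state" path
Add production deployment calibration, verification, and upgrade guidance; adjust the apt download retry configuration in the API image build stage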

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
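docs/develop-guides/roadmap.md	Records the new known-bug/fix entry
docs/develop-guides/changelog.md	Adds the v0.6.1 fix notes (attachment state sync, mounts, tests)
docs/advanced/deployment.md	Adds notes on production environment variables, default agent model calibration, and verification and upgrade steps
docker/api.Dockerfile	Switches the apt source protocol to https and adds retry parameters to reduce build failures
backend/test/unit/services/test_conversation_service_attachment_state.py	Unit test covering the write to `files/uploads` during attachment state sync
backend/test/unit/services/test_chat_service_sync.py	Unit test updated so that `extract_agent_state` output includes `uploads`, and verifies the retention behavior
backend/test/integration/api/test_chat_router.py	Integration test covering that state `files` and the content endpoint are readable after a docx upload
backend/package/yuxi/services/conversation_service.py	Adds/writes `state.files`, and writes `uploads/files` together during sync
backend/package/yuxi/services/chat_service.py	`extract_agent_state` output structure now includes `uploads`
backend/package/yuxi/agents/state.py	LangGraph state schema adds the `files/uploads` fields and their merge strategy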
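
To make the "fields plus merge strategy" idea in the last row concrete, here is a minimal sketch of a state schema whose channels carry a reducer in their `Annotated` metadata. It does not import LangGraph; `AttachmentState` and `apply_update` are hypothetical names, and `apply_update` is only a hand-rolled stand-in for what a graph runtime would do when it applies an update. The reducers use snapshot semantics (only `None` means "no update"), which is an assumption rather than the PR's actual code.

```python
from typing import Annotated, TypedDict, get_type_hints


def merge_uploads(existing, new):
    """Replace uploads with the latest snapshot; None means 'no update'."""
    return list(existing or []) if new is None else list(new)


def merge_files(existing, new):
    """Replace files with the latest snapshot; None means 'no update'."""
    return dict(existing or {}) if new is None else dict(new)


class AttachmentState(TypedDict):
    # Hypothetical schema: the second Annotated argument is the channel's reducer.
    uploads: Annotated[list[dict], merge_uploads]
    files: Annotated[dict[str, dict], merge_files]


def apply_update(state: dict, update: dict) -> dict:
    """Illustrative stand-in for a runtime applying per-channel reducers."""
    hints = get_type_hints(AttachmentState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        reducer = hints[key].__metadata__[0]
        merged[key] = reducer(state.get(key), value)
    return merged


# Illustrative data only; the path is made up for the example.
state = apply_update({}, {
    "uploads": [{"path": "attachments/report.md"}],
    "files": {"attachments/report.md": {"content": ["# Report"]}},
})
cleared = apply_update(state, {"uploads": [], "files": {}})
```

With these reducers, sending empty containers clears both channels, while omitting a key leaves that channel untouched.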


Copilot · 2026-04-18T05:08:30Z

+
+def merge_uploads(existing: list[dict] | None, new: list[dict] | None) -> list[dict]:
+    """Replace uploads with the latest normalized snapshot."""
+    return list(new or existing or [])


merge_uploads() uses new or existing, so an explicit empty list from graph.aupdate_state(values={"uploads": []}) will be treated as falsy and the old uploads will be retained. This breaks the “latest snapshot” semantics and means deleting all attachments won’t clear state.uploads. Consider returning new whenever it is not None (even if empty), and only falling back to existing when new is None.

Suggested change

-    return list(new or existing or [])
+    if new is not None:
+        return list(new)
+    if existing is not None:
+        return list(existing)
+    return []
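
The difference is easy to reproduce in isolation. The sketch below places the truthiness-based merge next to the `None`-checked variant; the function names `merge_uploads_truthy` and `merge_uploads_snapshot` and the sample upload record are made up for illustration and are not part of the PR.

```python
from __future__ import annotations


def merge_uploads_truthy(existing: list[dict] | None, new: list[dict] | None) -> list[dict]:
    """Behavior under review: an empty `new` is falsy, so `existing` wins."""
    return list(new or existing or [])


def merge_uploads_snapshot(existing: list[dict] | None, new: list[dict] | None) -> list[dict]:
    """Suggested behavior: only None means 'no update was provided'."""
    if new is not None:
        return list(new)
    if existing is not None:
        return list(existing)
    return []


old = [{"path": "attachments/report.md"}]  # hypothetical upload record

stale = merge_uploads_truthy(old, [])      # the old entry survives the "clear"
cleared = merge_uploads_snapshot(old, [])  # the empty snapshot is respected
untouched = merge_uploads_snapshot(old, None)  # None still falls back to existing
```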

Copilot · 2026-04-18T05:08:31Z

+    """Merge state file dictionaries with latest values winning."""
+    if existing is None:
+        return new or {}
+    if new is None:
+        return existing
+    return existing | new


merge_files() merges dicts with existing | new, which cannot remove keys. When attachments are deleted, conversation_service._sync_thread_upload_state() will send an empty {} for files, but the old file entries will remain in state. If files is intended to represent the current attachment snapshot, consider replacing with new when provided (even if empty), or implement a deletion/removal mechanism.

Suggested change

-    """Merge state file dictionaries with latest values winning."""
-    if existing is None:
-        return new or {}
-    if new is None:
-        return existing
-    return existing | new
+    """Replace state file dictionaries with the latest snapshot when provided."""
+    if new is None:
+        return existing or {}
+    return dict(new or {})
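
The comment also mentions a deletion/removal mechanism as an alternative to full replacement. One way that could look, purely as a sketch, is a tombstone convention where a `None` value marks a path for removal. `merge_files_with_tombstones` and the sample path are hypothetical; nothing in the PR defines this convention.

```python
from __future__ import annotations


def merge_files_union(existing: dict | None, new: dict | None) -> dict:
    """Behavior under review: dict union can add or overwrite, never remove."""
    if existing is None:
        return new or {}
    if new is None:
        return existing
    return existing | new


def merge_files_with_tombstones(existing: dict | None, new: dict | None) -> dict:
    """Hypothetical alternative: a None value marks a path for removal."""
    merged = dict(existing or {})
    for path, entry in (new or {}).items():
        if entry is None:
            merged.pop(path, None)  # tombstone: drop the path if present
        else:
            merged[path] = entry    # latest value wins
    return merged


state_files = {"attachments/report.md": {"content": ["# Report"]}}  # illustrative

after_union = merge_files_union(state_files, {})
after_tombstone = merge_files_with_tombstones(
    state_files, {"attachments/report.md": None}
)
```

The tombstone variant keeps incremental updates possible, at the cost of requiring every caller to send explicit removals; full replacement is simpler if `files` is always sent as a complete snapshot.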

Copilot · 2026-04-18T05:08:31Z

+        if attachment.get("status") != "parsed":
+            continue
+
+        file_path = attachment.get("file_path")
+        markdown = attachment.get("markdown")
+        if not isinstance(file_path, str) or not file_path.strip():
+            continue
+        if not isinstance(markdown, str) or not markdown:


_build_state_files() only uses attachment["file_path"] as the key. Existing attachment records (or future records) may only have path set (which is already validated in _build_state_uploads()), causing parsed attachments to be skipped and not materialized into state.files. Consider falling back to attachment.get("path") when file_path is missing/blank for better backward compatibility.
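
A fallback along these lines could be factored into a small helper. This is a sketch only: `resolve_attachment_path` is a hypothetical name, and the assumption that `path` holds the same kind of value as `file_path` comes from the comment above, not from the code shown in the diff.

```python
from __future__ import annotations


def resolve_attachment_path(attachment: dict) -> str | None:
    """Return the first usable path, preferring `file_path` over `path`."""
    for key in ("file_path", "path"):
        value = attachment.get(key)
        # Same validation as the diff: must be a non-blank string.
        if isinstance(value, str) and value.strip():
            return value
    return None


# Illustrative records; the paths are made up.
legacy_record = {"status": "parsed", "path": "attachments/legacy.md"}
blank_file_path = {"file_path": "   ", "path": "attachments/legacy.md"}
both_set = {"file_path": "attachments/new.md", "path": "attachments/old.md"}
```

`_build_state_files()` could then skip an attachment only when this helper returns `None`.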

Copilot · 2026-04-18T05:08:31Z

+    files: dict[str, dict] = {}
+    for attachment in attachments:
+        if attachment.get("status") != "parsed":
+            continue
+
+        file_path = attachment.get("file_path")
+        markdown = attachment.get("markdown")
+        if not isinstance(file_path, str) or not file_path.strip():
+            continue
+        if not isinstance(markdown, str) or not markdown:
+            continue
+
+        uploaded_at = attachment.get("uploaded_at") or utc_isoformat()
+        files[file_path] = {
+            "content": markdown.splitlines(),
+            "created_at": uploaded_at,
+            "modified_at": uploaded_at,
+        }
+
+    return files
+
+


There are now two very similar helpers for building state.files: yuxi.services.chat_service._build_state_files() and this new conversation_service._build_state_files(). Keeping both risks behavioral drift (e.g., splitlines vs split("\n"), timestamp defaults, path validation). Consider consolidating into a single shared helper (or importing one from the other) to ensure consistent state materialization.

Suggested change

-    files: dict[str, dict] = {}
-    for attachment in attachments:
-        if attachment.get("status") != "parsed":
-            continue
-
-        file_path = attachment.get("file_path")
-        markdown = attachment.get("markdown")
-        if not isinstance(file_path, str) or not file_path.strip():
-            continue
-        if not isinstance(markdown, str) or not markdown:
-            continue
-
-        uploaded_at = attachment.get("uploaded_at") or utc_isoformat()
-        files[file_path] = {
-            "content": markdown.splitlines(),
-            "created_at": uploaded_at,
-            "modified_at": uploaded_at,
-        }
-
-    return files
+    from yuxi.services.chat_service import _build_state_files as _shared_build_state_files
+
+    return _shared_build_state_files(attachments)
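
The `splitlines` vs `split("\n")` drift named in the comment is concrete: the two calls disagree on trailing newlines and on Windows line endings, so two helpers that differ only in this detail would store different `content` lists for the same attachment. The sample strings below are illustrative.

```python
markdown = "# Title\n\nBody line\n"  # typical converter output ending in a newline

by_splitlines = markdown.splitlines()
by_split = markdown.split("\n")

print(by_splitlines)  # ['# Title', '', 'Body line']
print(by_split)       # ['# Title', '', 'Body line', '']  (extra trailing entry)

crlf = "first\r\nsecond"
print(crlf.splitlines())  # ['first', 'second']
print(crlf.split("\n"))   # ['first\r', 'second']  (carriage return kept)
```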

fix: Fix the attachment markdown read path and production deployment mount

ea8b819

xerrors requested a review from Copilot April 18, 2026 05:04

Copilot started reviewing on behalf of xerrors April 18, 2026 05:04

safesummer wants to merge 1 commit into xerrors:main from safesummer:fix/attachment-sandbox-sync


3 participants
