This project is an AI-assisted writing system based on single-agent and multi-agent workflows: WenCe AI (Word Agent). After users install the add-in in office suites (such as WPS and Microsoft Word), they can interact with AI in natural language to get writing suggestions, content generation, and structure optimization.
WenCe AI (Word Agent): strategy-driven writing, smarter expression.
The backend is built with FastAPI, and the frontend WPS add-in communicates with the backend through streaming APIs, so users can see LLM outputs in real time for a seamless writing-assistant experience.
The frontend is developed with Vue 3 and JavaScript. A key module is the DocxJson bidirectional converter, which transforms formatted Word document content and JSON structures into each other.
The backend is implemented in Python, using LangChain and LangGraph for agent design and collaboration, ChatOpenAI-compatible APIs for SSE streaming and tool invocation, and a lightweight PySide6 desktop panel for add-in installation and log viewing.
At its core, this project focuses on structured Word document generation. The project defines a JSON schema conceptually similar to HTML and CSS, abstracting paragraph and text-run styles to help agents understand and generate well-formatted documents.
Main JSON data structures:
- paragraphs: an array of Word paragraphs containing multiple runs; this is the primary editable object for the agent
- pStyle: paragraph style ID (for example, Heading 1, Heading 2, Body)
- runs: text-run array, the smallest content unit in this project
- text: text content
- rStyle: character style ID (for example, bold, red)
- paraIndex: paragraph index so the agent can locate and edit a specific paragraph precisely
- styles: style definition dictionary containing all paragraph and character style definitions; the agent references these style IDs when generating content
Compared with common AI writing tools, WenCe AI provides:
- Cross-version and cross-platform compatibility: built on mainstream office software with a Copilot-like add-in UX, lowering the adoption barrier for regular users, with support for both Windows and Linux.
- Native rich-text editing with style and paragraph awareness: unlike many Word AI assistants, this project understands Word structure, can collect web information autonomously, and can modify both structure and content according to user requirements.
- Efficient editing with multi-agent collaboration: multiple agents play different expert roles and collaborate to produce deeper long-form writing.
- Open and flexible integration: users provide their own API keys and can choose from mainstream LLM providers and models.
| WPS Add-in UI | Backend Qt UI |
|---|---|
![]() |
![]() |
For example, in WPS single-agent mode, a user asks: "Expand my internship objective into five points." The agent first calls search_document to locate the target paragraph, then calls read_document to read it, analyzes the content, calls delete_document to remove the original text, and finally calls generate_document to produce the rewritten result. The frontend add-in renders before/after changes with different highlight colors so users can clearly see what the agent modified.
The generated article conforms to Word document structure and formatting. While generating text, the agent also returns style metadata (for example titles, body text, bold, fonts, indentation, and line spacing). The frontend uses these style definitions to render properly formatted Word content.
In addition, this project supports custom tool integration through MCP server configuration, allowing agents to call third-party APIs. Using Tavily Search MCP and a Visualization Chart MCP Server as an example: when a user asks for today's Changsha temperature and a one-week temperature table, the agent can call Tavily MCP to fetch weather data, then call generate_document to create table content and return it to the frontend for rendering. If the user then asks, "Based on this table, draw a temperature bar chart," the agent calls read_document to understand the table and then calls the chart MCP server to generate an image URL, which is rendered in the add-in chat panel.
- Single-agent mode
- Multi-agent mode
- Remote MCP server integration
- Local MCP server and Skill tool integration
- Advanced style editing (tables, illustrations, equations, etc.)
- WPS Office (Windows, Linux), version 12.1.25225 and above
- Microsoft Word (Windows, Web), version 2019/2021 and above
To better satisfy user needs and improve generation stability and depth, the project provides two agent architectures.
The frontend WPS add-in converts user requests and selected paragraph ranges into structured JSON and sends them to the backend.
In the backend single-agent architecture, the system follows a standard ReAct loop. In each round, the agent reasons over the user input and current document state, chooses whether to call a tool (for example, web search) or end directly, then reasons again based on tool results and continues until completion.
- read_document tool: reads content within
(startPosition, endPosition)and returns structured JSON to the agent. - generate_document tool: generates structured JSON content and returns it to the frontend add-in.
- search_document tool: locates paragraphs by format or text criteria and returns positions to the agent.
- web_fetch tool: fetches information from websites provided by the user.
The frontend flow is the same as in single-agent mode. In the backend multi-agent workflow, a planner agent orchestrates and schedules multiple specialized agents.
- research agent: collects external references
- outline agent: generates an article outline from references and requirements
- writer agent: writes content based on references and requirements
- reviewer agent: reviews generated content and provides revision suggestions
- Node v22.12.0
- wpsjs 2.2.3
- Python 3.11.14
- Windows 10/11 or Ubuntu 22.04
cd frontend/wps_word_plugin # WPS Word add-in
cd frontend/microsoft_word_plugin # or Microsoft Word add-in
pnpm install
pnpm buildcd backend
uv run python main.pyThe project also supports LangSmith for tracing and analyzing agent behavior. See backend/README.md for configuration details.
cd backend/deploy
uv run pyinstaller wence.specThe packaged executable is generated in backend/deploy/dist.
If you do not want to package it yourself, you can download prebuilt archives from Releases and run the executable directly.
Packaged release files: Release.
After downloading, extract and run the executable. Start the backend service (wence_word_plugin -> Install), open Word, trust the add-in, and start using the system.
You need to configure an LLM API. This project is currently tested with Alibaba Bailian Qwen3.5-Plus APIs.
The project has tested part of the mainstream LLM APIs in China, and compatibility is still being expanded:
- Qwen 3.5 Plus (stable)
- Qwen3 Max (stable)
- GLM-5 (stable)
- GPT 5.4 (stable)
- MiniMax M2.5 (stable)
- Step 3.5 Flash (stable)
- DeepSeek v3.2 (stable)
- Claude Sonnet 4.6 (issues with document-generation tool calls)
- Kimi K2.5 (can fall into tool-call loops)
- Gemini 3.1 Pro
Note: part of development used free quotas from Alibaba Bailian and OpenRouter.
Contact: https://cmcblog.netlify.app/about/
This project is released under the Apache License 2.0.







