Word Agent is an AI-assisted writing system (single-agent and multi-agent): WenCe AI. After installing the add-in in office suites (such as WPS and Microsoft Word), users can interact with AI through natural language to get writing suggestions, content generation, and structure optimization.
WenCe AI (Word Agent): strategy-driven writing, smarter expression.
The backend is built with FastAPI. The frontend add-ins communicate with the backend through streaming interfaces so users can see LLM outputs in real time and get a seamless writing-assistant experience.
The frontend is built with Vue 3 and JavaScript. A key module is the DocxJson bidirectional converter, which transforms formatted Word content to JSON and back.
The backend is built with Python and uses LangChain + LangGraph for agent design and orchestration. ChatOpenAI-compatible APIs are used for streaming and tool calling. A lightweight PySide6 GUI is also provided for add-in installation and log monitoring.
The core of this project is structured Word document generation. The project defines a JSON schema (conceptually similar to HTML + CSS) to model Word paragraphs and text run styles so agents can understand and generate formatted documents reliably.
Main data structures:
- paragraphs: an array of Word paragraphs (the main editable unit)
- pStyle: paragraph style ID (for example, Heading 1, Heading 2, Body)
- runs: text run array (smallest text unit in this project)
- text: run text
- rStyle: character style ID (for example, bold, red)
- paraIndex: paragraph index used by agents for precise location and editing
- styles: a style dictionary containing all paragraph/character style definitions referenced by style IDs
Compared with common AI writing tools, WenCe AI focuses on:
- Cross-version and cross-platform compatibility: built on mainstream office software with a Copilot-like add-in experience, available on Windows and Linux.
- Native rich-text document editing: agents understand Word structure, can search the web, and can modify both structure and content with formatting awareness.
- Efficient editing with multi-agent collaboration: specialized agents cooperate to produce deeper long-form content.
- Open model ecosystem: users can configure their own API providers and models.
| WPS Add-in UI | Backend Qt UI |
|---|---|
![]() |
![]() |
Example (single-agent mode): in WPS, a user asks for a detailed report on the Iran war. The agent can call web_search to gather references and then call generate_document to produce Word-structured content for rendering in the add-in.
Generated content includes both text and formatting metadata (title/body styles, bold, fonts, indentation, spacing, etc.), which allows rendering as a properly formatted Word document.
Another example: when a user asks, "Expand my internship objective into five points," the agent can first call query_document to locate target paragraphs, then call read_document to fetch the original content, call delete_document to remove old text, and finally call generate_document to produce the rewritten result.
The frontend add-in can render before/after changes with different highlight colors so users can clearly see what was modified.
- WPS Word desktop support
- Windows and Linux support
- Single-agent mode
- Token usage optimization
- Multi-agent mode
- Microsoft Word web add-in support
- Advanced styles (tables, figures, etc.)
The project provides two agent architectures for different task complexity.
The frontend sends user prompts and selected document ranges to the backend.
The backend runs a ReAct-style loop. In each loop, the agent decides whether to call a tool or finish. After tool results return, the agent reasons again and continues until completion.
Core tools:
- read_document: reads a paragraph range and returns structured JSON.
- generate_document: generates structured document JSON for frontend rendering.
- query_document: locates paragraphs by content/style criteria.
- web_search: retrieves online references for grounded writing.
The frontend flow is the same, while the backend uses a planner-driven multi-agent workflow.
- planner agent: decomposes and schedules workflow steps
- research agent: gathers external references
- outline agent: creates document outline
- writer agent: generates document content
- reviewer agent: reviews quality and provides rewrite feedback
- Node v22.12.0
- wpsjs 2.2.3
- Python 3.11.14
- Windows 10/11 or Ubuntu 22.04
# Option A: WPS Word add-in
cd frontend/wps_word_plugin
# Option B: Microsoft Word add-in
# cd frontend/microsoft_word_plugin
pnpm install
pnpm buildcd backend
uv run python main.pyThe project supports LangSmith tracing for agent behavior analysis. See backend/README.md for setup details.
cd backend/deploy
uv run pyinstaller wence.specBuilt binaries are located in backend/deploy/dist.
You can also download packaged artifacts from Releases.
Release artifacts: Release
After download, extract the package and run the executable. Start backend service (wence_word_plugin -> install), open Word, trust the add-in, and start using the system.
You must configure an LLM API provider. The project has been tested with Alibaba Bailian Qwen models.
Current compatibility status (ongoing):
- Qwen 3.5 Plus (stable)
- Qwen3 Max (stable)
- GLM-5 (stable)
- GPT 5.4 (stable)
- MiniMax M2.5 (stable)
- Step 3.5 Flash (stable)
- DeepSeek v3.2 (stable)
- Kimi K2.5 (may enter tool-call loops)
- Qwen Max (unstable tool calls, may fail to generate document)
- Gemini 3.1 Pro
Note: part of development used free credits from Alibaba Bailian and OpenRouter.
Contact: https://cmcblog.netlify.app/about/
Apache License 2.0







