Word Agent

1. Overview

Word Agent is an AI-assisted writing system (single-agent and multi-agent): WenCe AI. After installing the add-in in office suites (such as WPS and Microsoft Word), users can interact with AI through natural language to get writing suggestions, content generation, and structure optimization.

WenCe AI (Word Agent): strategy-driven writing, smarter expression.

The backend is built with FastAPI. The frontend add-ins communicate with the backend through streaming interfaces so users can see LLM outputs in real time and get a seamless writing-assistant experience.

The frontend is built with Vue 3 and JavaScript. A key module is the DocxJson bidirectional converter, which transforms formatted Word content to JSON and back.

The backend is built with Python and uses LangChain + LangGraph for agent design and orchestration. ChatOpenAI-compatible APIs are used for streaming and tool calling. A lightweight PySide6 GUI is also provided for add-in installation and log monitoring.

The core of this project is structured Word document generation. The project defines a JSON schema (conceptually similar to HTML + CSS) to model Word paragraphs and text run styles so agents can understand and generate formatted documents reliably.

Main data structures:

paragraphs: an array of Word paragraphs (the main editable unit)
- pStyle: paragraph style ID (for example, Heading 1, Heading 2, Body)
- runs: text run array (smallest text unit in this project)
  - text: run text
  - rStyle: character style ID (for example, bold, red)
- paraIndex: paragraph index used by agents for precise location and editing
styles: a style dictionary containing all paragraph/character style definitions referenced by style IDs

Compared with common AI writing tools, WenCe AI focuses on:

Cross-version and cross-platform compatibility: built on mainstream office software with a Copilot-like add-in experience, available on Windows and Linux.
Native rich-text document editing: agents understand Word structure, can search the web, and can modify both structure and content with formatting awareness.
Efficient editing with multi-agent collaboration: specialized agents cooperate to produce deeper long-form content.
Open model ecosystem: users can configure their own API providers and models.

2. Preview

WPS Add-in UI	Backend Qt UI

Example (single-agent mode): in WPS, a user asks for a detailed report on the Iran war. The agent can call web_search to gather references and then call generate_document to produce Word-structured content for rendering in the add-in.

Generated content includes both text and formatting metadata (title/body styles, bold, fonts, indentation, spacing, etc.), which allows rendering as a properly formatted Word document.

Another example: when a user asks, "Expand my internship objective into five points," the agent can first call query_document to locate target paragraphs, then call read_document to fetch the original content, call delete_document to remove old text, and finally call generate_document to produce the rewritten result.

The frontend add-in can render before/after changes with different highlight colors so users can clearly see what was modified.

3. Roadmap

4. Architecture

The project provides two agent architectures for different task complexity.

4.1 Single-agent loop

The frontend sends user prompts and selected document ranges to the backend.

The backend runs a ReAct-style loop. In each loop, the agent decides whether to call a tool or finish. After tool results return, the agent reasons again and continues until completion.

Core tools:

read_document: reads a paragraph range and returns structured JSON.
generate_document: generates structured document JSON for frontend rendering.
query_document: locates paragraphs by content/style criteria.
web_search: retrieves online references for grounded writing.

4.2 Multi-agent workflow

The frontend flow is the same, while the backend uses a planner-driven multi-agent workflow.

planner agent: decomposes and schedules workflow steps
research agent: gathers external references
outline agent: creates document outline
writer agent: generates document content
reviewer agent: reviews quality and provides rewrite feedback

5. Quick Start

5.1 Environment

Node v22.12.0
wpsjs 2.2.3
Python 3.11.14
Windows 10/11 or Ubuntu 22.04

5.2 Build frontend add-in

# Option A: WPS Word add-in
cd frontend/wps_word_plugin
# Option B: Microsoft Word add-in
# cd frontend/microsoft_word_plugin
pnpm install
pnpm build

5.3 Run backend service

cd backend
uv run python main.py

5.4 LangSmith tracing

The project supports LangSmith tracing for agent behavior analysis. See backend/README.md for setup details.

5.5 Packaging

cd backend/deploy
uv run pyinstaller wence.spec

Built binaries are located in backend/deploy/dist.

You can also download packaged artifacts from Releases.

5.6 Download

Release artifacts: Release

5.7 Run packaged app

After download, extract the package and run the executable. Start backend service (wence_word_plugin -> install), open Word, trust the add-in, and start using the system.

You must configure an LLM API provider. The project has been tested with Alibaba Bailian Qwen models.

6. LLM API Status

Current compatibility status (ongoing):

Note: part of development used free credits from Alibaba Bailian and OpenRouter.

7. Author

Contact: https://cmcblog.netlify.app/about/

8. License

Apache License 2.0

Name		Name	Last commit message	Last commit date
Latest commit History 93 Commits
.github/workflows		.github/workflows
backend		backend
docs		docs
frontend		frontend
web		web
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
README.zh-CN.md		README.zh-CN.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Word Agent

1. Overview

2. Preview

3. Roadmap

4. Architecture

4.1 Single-agent loop

4.2 Multi-agent workflow

5. Quick Start

5.1 Environment

5.2 Build frontend add-in

5.3 Run backend service

5.4 LangSmith tracing

5.5 Packaging

5.6 Download

5.7 Run packaged app

6. LLM API Status

7. Author

8. License

About

Uh oh!

Releases 4

Contributors 1

Languages

Folders and files

Latest commit

History

Repository files navigation

Word Agent

1. Overview

2. Preview

3. Roadmap

4. Architecture

4.1 Single-agent loop

4.2 Multi-agent workflow

5. Quick Start

5.1 Environment

5.2 Build frontend add-in

5.3 Run backend service

5.4 LangSmith tracing

5.5 Packaging

5.6 Download

5.7 Run packaged app

6. LLM API Status

7. Author

8. License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 4

Contributors 1

Languages