Add compact plugin for auto context compression by GoDiao · Pull Request #305 · algorithmicsuperintelligence/optillm

GoDiao · 2026-05-06T17:19:42Z

Summary

Closes #249

Adds a compact plugin that automatically compresses conversation context when it exceeds a configurable token budget, inspired by Claude Code's /compact mechanism.

How it works

Token budget check - Estimates token count of the conversation; if below threshold, passes through unchanged (zero overhead)
Context window detection - Tries to get context window from provider's /models endpoint (e.g. Ollama, vLLM expose context_length), then falls back to config
Split regions - Older turns are compressed via LLM into a structured summary; recent N turns are preserved verbatim
Structured summary - The LLM produces a summary with Scope, Key decisions, User preferences, Pending work, Key files, and Context
Graceful fallback - If LLM compression fails, returns original query unchanged

Composability

Works with the & operator for pipeline composition:

compact&moa    -> compact first, then Mixture of Agents
compact&bon    -> compact first, then Best of N

Configuration

Config	Env Var	Default	Description
`compact_context_window`	`COMPACT_CONTEXT_WINDOW`	128000	Max context tokens
`compact_threshold`	`COMPACT_THRESHOLD`	0.75	Trigger ratio (0.0-1.0)
`compact_keep_recent`	`COMPACT_KEEP_RECENT`	4	Turns to preserve verbatim

Tests

32 unit/integration tests in tests/test_compact_plugin.py
Plugin registration test added to tests/test_plugins.py

50 tests passed (18 plugin + 32 compact)

End-to-end verification

Tested with a live API (MiMo):

31-turn conversation (10,119 tokens) compressed to 580 tokens (94% reduction)
29 older turns summarized, 2 recent turns preserved verbatim
Context window detection: provider /models returned 400, gracefully fell back to config default

Test plan

All existing plugin tests pass (18/18)
Unit tests for token estimation, conversation parsing, config priority
Integration tests for passthrough, compression, LLM failure fallback
Edge cases: summary tag extraction, malformed env vars, embedded tags
Manual end-to-end test with live API - plugin loads and executes correctly

…perintelligence#249)

CLAassistant · 2026-05-06T17:19:49Z

All committers have signed the CLA.

CLAassistant · 2026-05-06T17:19:49Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
_{You have signed the CLA already but the status is still pending? Let us recheck it.}

GoDiao · 2026-05-07T04:19:23Z

Hi! The CI failures in conversation-logging-tests and integration-tests appear to be unrelated to this PR. They're failing due to HuggingFace authentication issues when trying to access the gated model google/gemma-3-270m-it:

openai.InternalServerError: Error code: 500 - You are trying to access a gated repo.
Make sure to have access to it at https://huggingface.co/google/gemma-3-270m-it.

The unit-tests and test_plugins checks passed successfully. These failures seem to be a CI environment issue (missing HuggingFace token) rather than anything introduced by this change. Happy to fix if there's anything on our side though!

…3.15 google/gemma-3-270m-it became gated, breaking integration-tests and conversation-logging-tests in CI. Swap in Qwen/Qwen2.5-Coder-0.5B-Instruct which is public, instruction-tuned, has a chat_template, no thinking mode, and works with the existing transformers pin. Verified locally that test_json_plugin.py (9/9), test_n_parameter.py, test_reasoning_integration.py (8/8), and test_conversation_logging_server.py (10/10) now pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Add compact plugin for auto context compression (closes algorithmicsu…

5c07f46

…perintelligence#249)

codelion merged commit df018d6 into algorithmicsuperintelligence:main May 7, 2026
5 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add compact plugin for auto context compression#305

Add compact plugin for auto context compression#305
codelion merged 2 commits intoalgorithmicsuperintelligence:mainfrom
GoDiao:feature/compact-plugin

GoDiao commented May 6, 2026

Uh oh!

CLAassistant commented May 6, 2026 •

edited

Loading

Uh oh!

CLAassistant commented May 6, 2026

Uh oh!

GoDiao commented May 7, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

GoDiao commented May 6, 2026

Summary

How it works

Composability

Configuration

Tests

End-to-end verification

Test plan

Uh oh!

CLAassistant commented May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

CLAassistant commented May 6, 2026

Uh oh!

GoDiao commented May 7, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

CLAassistant commented May 6, 2026 •

edited

Loading