Add optimizer packages and infrastructure (non-functional) #3516

Closed
therealnb wants to merge 1 commit into main from optimizer-pr1-packages

@therealnb

Summary

This PR adds the core optimizer implementation without wiring it into the server startup. The optimizer provides semantic tool discovery using embeddings to reduce token usage for LLMs working with large toolsets.

This is part 1 of a 3-part PR split from #3440:

  • PR 1 (this): Non-functional packages, deps, config schema
  • PR 2: Integration/wiring into server startup
  • PR 3: Documentation and examples

Changes

New Optimizer Packages (pkg/vmcp/optimizer/internal/)

  • db/: SQLite database with FTS5 full-text search
  • embeddings/: Embedding providers (Ollama, OpenAI-compatible)
  • ingestion/: Tool ingestion service
  • models/: Data models and transport types
  • tokens/: Token counting utilities

Core Optimizer (pkg/vmcp/optimizer/)

  • EmbeddingOptimizer implementation with hybrid search (semantic + FTS)
  • Updated Optimizer interface with Close() and HandleSessionRegistration()
  • Updated dummy_optimizer.go to implement new interface (backward compat)
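
The description does not say how the semantic and FTS results are combined. One common way to merge two rankings is reciprocal rank fusion (RRF), sketched below as an illustration only; it may differ from what `EmbeddingOptimizer` actually does. The function names (`cosine`, `rankBySimilarity`, `fuseRRF`), the tool names, the three-dimensional vectors, and the constant `k = 60` are all assumptions made for this example, and the keyword list stands in for whatever order an FTS5 query would return.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// cosine returns the cosine similarity of two vectors, or 0 if their
// lengths differ or either vector has zero magnitude.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// rankBySimilarity orders tool names by descending cosine similarity
// between the query embedding and each tool embedding.
func rankBySimilarity(query []float64, tools map[string][]float64) []string {
	names := make([]string, 0, len(tools))
	for name := range tools {
		names = append(names, name)
	}
	sort.Slice(names, func(i, j int) bool {
		si, sj := cosine(query, tools[names[i]]), cosine(query, tools[names[j]])
		if si != sj {
			return si > sj
		}
		return names[i] < names[j] // deterministic tie-break
	})
	return names
}

// fuseRRF merges ranked lists with reciprocal rank fusion: each tool scores
// the sum of 1/(k+rank) over the lists it appears in, with rank starting at 1.
func fuseRRF(k float64, lists ...[]string) []string {
	scores := map[string]float64{}
	for _, list := range lists {
		for i, name := range list {
			scores[name] += 1 / (k + float64(i+1))
		}
	}
	out := make([]string, 0, len(scores))
	for name := range scores {
		out = append(out, name)
	}
	sort.Slice(out, func(i, j int) bool {
		if scores[out[i]] != scores[out[j]] {
			return scores[out[i]] > scores[out[j]]
		}
		return out[i] < out[j]
	})
	return out
}

func main() {
	// Hypothetical tool embeddings; real ones would come from an embedding provider.
	tools := map[string][]float64{
		"create_issue": {0.9, 0.1, 0.0},
		"list_repos":   {0.2, 0.8, 0.1},
		"send_email":   {0.0, 0.1, 0.9},
	}
	semantic := rankBySimilarity([]float64{1, 0, 0}, tools)
	keyword := []string{"create_issue", "send_email"} // stand-in for an FTS result order
	fmt.Println(fuseRRF(60, semantic, keyword))
}
```

RRF only needs rank positions, so it avoids having to normalize cosine similarities against FTS relevance scores, which live on different scales.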

Config & Schema

  • Config schema updates for optimizer configuration
  • CRD schema updates for VirtualMCPServer optimizer fields
  • OptimizerHandlerProvider interface in adapter package

Tests

  • Comprehensive test coverage for all new packages
  • Updated schema tests for new struct types

Test Plan

  • All optimizer package tests pass
  • All vmcp package tests pass
  • Build succeeds
  • E2E tests (will be enabled in PR 2)

Contributor

@github-actions github-actions bot left a comment

Large PR Detected

This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.

How to unblock this PR:

Add a section to your PR description with the following format:

## Large PR Justification

[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation

Alternative:

Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.

See our Contributing Guidelines for more details.


This review will be automatically dismissed once you add the justification section.

@JAORMX
Collaborator

JAORMX commented Feb 10, 2026

Hey, question: how are we deploying this SQLite database in Kubernetes?

SQLite's locking model relies on POSIX fcntl locks, which are broken on network filesystems. WAL mode needs shared memory that can't work across network boundaries. Projects like Woodpecker CI and Uptime Kuma have documented instant corruption on NFS-backed PVs. Cloud block storage (EBS, GCE PD) works, but RWO PVCs can get "stuck" on a failed node, leaving the pod in Pending until someone manually intervenes.

Note that the optimizer's SQLite DB is essentially a reconstructable cache. Tools come from discovery, embeddings from the EmbeddingServer. If the DB is lost, we just re-ingest. So, do we even need a persistent volume here? A plain emptyDir with rebuild-on-start might be enough for ~1000 tools.

I'd love to see an explicit decision on the storage backend and a warning against NFS-backed StorageClasses in the docs.

Thoughts?

@jerm-dro
Contributor

> Note that the optimizer's SQLite DB is essentially a reconstructable cache. Tools come from discovery, embeddings from the EmbeddingServer. If the DB is lost, we just re-ingest. So, do we even need a persistent volume here? A plain emptyDir with rebuild-on-start might be enough for ~1000 tools.

@JAORMX This is the plan. We will not persist the storage. Further discussion here: https://stacklok.slack.com/archives/C09L9QF47EU/p1770676947855589
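
For illustration, the rebuild-on-start idea could look roughly like the sketch below. It is a sketch under stated assumptions, not the implementation from this PR: the `Tool` type, `rebuildOnStart`, `runDemo`, the file name `optimizer.db`, and the `discover`/`ingest` callbacks are all hypothetical, and the ingest step writes a placeholder file instead of opening SQLite. Clearing leftovers matters even with an emptyDir, because an emptyDir survives container restarts within the same pod.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// Tool is a hypothetical stand-in for a discovered tool definition.
type Tool struct{ Name, Description string }

// rebuildOnStart treats the database as a disposable cache: it removes any
// database files left in dir by a previous run, then re-ingests every tool
// returned by discover. It returns the number of tools ingested.
func rebuildOnStart(
	dir string,
	discover func() ([]Tool, error),
	ingest func(dbPath string, tools []Tool) error,
) (int, error) {
	if err := os.MkdirAll(dir, 0o700); err != nil {
		return 0, fmt.Errorf("create cache dir: %w", err)
	}
	dbPath := filepath.Join(dir, "optimizer.db") // assumed file name
	// The pattern also matches SQLite side files such as
	// optimizer.db-wal and optimizer.db-shm.
	stale, err := filepath.Glob(dbPath + "*")
	if err != nil {
		return 0, fmt.Errorf("list stale files: %w", err)
	}
	for _, f := range stale {
		if err := os.Remove(f); err != nil {
			return 0, fmt.Errorf("remove %s: %w", f, err)
		}
	}
	tools, err := discover()
	if err != nil {
		return 0, fmt.Errorf("discover tools: %w", err)
	}
	if err := ingest(dbPath, tools); err != nil {
		return 0, fmt.Errorf("ingest tools: %w", err)
	}
	return len(tools), nil
}

// runDemo exercises rebuildOnStart in a temporary directory that plays the
// role of the emptyDir mount.
func runDemo() (int, error) {
	dir, err := os.MkdirTemp("", "optimizer-cache-")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	// Simulate a file left behind by a previous container run.
	wal := filepath.Join(dir, "optimizer.db-wal")
	if err := os.WriteFile(wal, []byte("stale"), 0o600); err != nil {
		return 0, err
	}

	discover := func() ([]Tool, error) {
		return []Tool{{Name: "create_issue"}, {Name: "list_repos"}}, nil
	}
	// Placeholder ingest: a real one would open SQLite at dbPath and
	// store tool rows and embeddings.
	ingest := func(dbPath string, tools []Tool) error {
		return os.WriteFile(dbPath, []byte(fmt.Sprintf("%d tools\n", len(tools))), 0o600)
	}

	n, err := rebuildOnStart(dir, discover, ingest)
	if err != nil {
		return 0, err
	}
	if _, err := os.Stat(wal); !errors.Is(err, os.ErrNotExist) {
		return 0, errors.New("stale WAL file survived the rebuild")
	}
	return n, nil
}

func main() {
	n, err := runDemo()
	if err != nil {
		fmt.Println("rebuild failed:", err)
		os.Exit(1)
	}
	fmt.Printf("re-ingested %d tools\n", n)
}
```

The trade-off is startup latency: every start pays the full discovery and embedding cost, which is the price of never having to reason about a corrupted or stuck volume.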

@jerm-dro
Contributor

Closing this PR as it will be replaced by work done in the linked tasks above.

@jerm-dro jerm-dro closed this Feb 10, 2026
Labels

size/XL Extra large PR: 1000+ lines changed
