feat: add LiteLLM as unified LLM provider gateway by RheagalFire · Pull Request #5 · henrydaum/second-brain

RheagalFire · 2026-05-13T22:01:09Z

Summary

Adds LiteLLM as a third LLM backend alongside OpenAILLM and LMStudioLLM, enabling access to 100+ LLM providers (OpenAI, Anthropic, Google, AWS Bedrock, Azure, Groq, Mistral, Cohere, etc.) through the LiteLLM SDK.

Changes

| File | What |
| --- | --- |
| `plugins/services/service_llm.py` | New `LiteLLM(BaseLLM)` class (~150 LOC): `invoke()`, `stream()`, `chat_with_tools()` with lazy litellm import, `drop_params=True`, image injection, tool call parsing, token usage tracking. Rate limit / auth errors checked by class name before `is_context_limit_error()` heuristic to prevent false positives. Updated `_build_llm_from_profile()` factory to handle `"LiteLLM"` class name. |
| `requirements.txt` | Added `litellm>=1.60,<1.85` |
| `tests/test_litellm_llm.py` | 16 unit tests |
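
For readers unfamiliar with the lazy-import pattern mentioned for `service_llm.py`, it can be pictured roughly as below. This is a sketch only, not the code in this PR: `lazy_import` is a hypothetical helper name, and returning `None` when the package is missing is an assumption about how an absent dependency might be handled.

```python
import importlib
from types import ModuleType
from typing import Optional


def lazy_import(module_name: str) -> Optional[ModuleType]:
    """Import a module only when it is first needed.

    Returns None instead of raising if the package is not installed,
    so callers can report a clear error at load time.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None
```

A backend could call `lazy_import("litellm")` inside its `load()` step, so that users of the other backends never pay the import cost.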

Usage

Add a LiteLLM profile in your config:

{
  "llm_profiles": {
    "anthropic/claude-sonnet-4-6": {
      "llm_service_class": "LiteLLM",
      "llm_api_key": "ANTHROPIC_API_KEY",
      "llm_context_size": 200000
    },
    "openai/gpt-4o": {
      "llm_service_class": "LiteLLM",
      "llm_api_key": "OPENAI_API_KEY",
      "llm_context_size": 128000
    }
  },
  "default_llm_profile": "anthropic/claude-sonnet-4-6"
}

LiteLLM reads provider-specific environment variables (ANTHROPIC_API_KEY, OPENAI_API_KEY, etc.) automatically; alternatively, pass llm_api_key in the profile config.

Direct usage:

from plugins.services.service_llm import LiteLLM

llm = LiteLLM("anthropic/claude-sonnet-4-6")
llm.load()

result = llm.invoke([{"role": "user", "content": "What is 2+2?"}])
print(result.content)  # "4"

for chunk in llm.stream([{"role": "user", "content": "Say hello"}]):
    print(chunk, end="")

Any model string LiteLLM supports works:

anthropic/claude-sonnet-4-6
openai/gpt-4o
vertex_ai/gemini-2.5-flash
bedrock/anthropic.claude-sonnet-4-6-v2:0
groq/llama-4-scout-17b-16e-instruct
mistral/mistral-large-latest

Integration bug caught during deep-dive and fixed

is_context_limit_error() uses a heuristic that matches "tokens" + "limit" in error text. LiteLLM's rate limit messages (e.g., "Rate limit exceeded. Quota request exceeds the tokens limit.") contain both words, causing rate limits to be misclassified as context overflow. This would trigger the compact-and-retry path in conversation_loop.py instead of surfacing the actual error.

Fix: check exception class name (RateLimitError, AuthenticationError, NotFoundError) before the heuristic, so these deterministic errors short-circuit to provider_error instead of triggering context compaction.
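
The ordering described in the fix can be sketched as follows. This is an illustration under assumptions, not the PR's actual code: `classify_error` and the `"context_limit"` label are hypothetical, the exception class here is a local stand-in for the one LiteLLM raises, and `is_context_limit_error` is reduced to the "tokens" + "limit" match described above.

```python
# Class names treated as deterministic provider failures (from the PR text).
DETERMINISTIC_ERRORS = {"RateLimitError", "AuthenticationError", "NotFoundError"}


def is_context_limit_error(exc: Exception) -> bool:
    """Simplified stand-in for the heuristic: "tokens" and "limit" both appear."""
    text = str(exc).lower()
    return "tokens" in text and "limit" in text


def classify_error(exc: Exception) -> str:
    """Check the exception class name first, then fall back to the heuristic."""
    # Walk the MRO so subclasses of the named errors are caught as well.
    if any(cls.__name__ in DETERMINISTIC_ERRORS for cls in type(exc).__mro__):
        return "provider_error"
    if is_context_limit_error(exc):
        return "context_limit"  # hypothetical label for the compact-and-retry path
    return "provider_error"


class RateLimitError(Exception):
    """Local stand-in for the provider's rate limit exception."""


rate_limited = RateLimitError(
    "Rate limit exceeded. Quota request exceeds the tokens limit."
)
print(is_context_limit_error(rate_limited))  # heuristic alone would match
print(classify_error(rate_limited))          # class-name check wins
```

Matching on the class name rather than importing the exception types keeps the check usable even when litellm is not installed, which fits the lazy-import approach.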

Tests

Unit tests: 16/16 pass

$ python -m pytest tests/test_litellm_llm.py -v
tests/test_litellm_llm.py::test_build_llm_from_profile_litellm PASSED
tests/test_litellm_llm.py::test_chat_with_tools_delegates_to_invoke PASSED
tests/test_litellm_llm.py::test_chat_with_tools_passes_attachments PASSED
tests/test_litellm_llm.py::test_context_limit_raises_provider_error PASSED
tests/test_litellm_llm.py::test_image_capability_inferred PASSED
tests/test_litellm_llm.py::test_invoke_dispatches_to_litellm PASSED
tests/test_litellm_llm.py::test_invoke_forwards_api_key PASSED
tests/test_litellm_llm.py::test_invoke_forwards_base_url PASSED
tests/test_litellm_llm.py::test_invoke_not_loaded_returns_error PASSED
tests/test_litellm_llm.py::test_invoke_returns_prompt_tokens PASSED
tests/test_litellm_llm.py::test_invoke_tool_calls PASSED
tests/test_litellm_llm.py::test_null_content_returns_empty PASSED
tests/test_litellm_llm.py::test_rate_limit_not_misclassified_as_context_limit PASSED
tests/test_litellm_llm.py::test_response_format_passthrough PASSED
tests/test_litellm_llm.py::test_stream_not_loaded_returns_nothing PASSED
tests/test_litellm_llm.py::test_stream_yields_chunks PASSED
========================= 16 passed in 1.04s =========================

Live E2E: Anthropic Claude Sonnet 4-6

=== Test 1: invoke ===
Content: 4
Tokens: 20
Error: None

=== Test 2: stream ===
Stream: Hello|!

PASS

Risk / Compatibility

Additive only. OpenAILLM and LMStudioLLM untouched.
LiteLLM follows the same BaseLLM interface: invoke(), stream(), chat_with_tools() all return the same LLMResponse type.
litellm is a new dependency in requirements.txt. Users who don't use it can ignore it.
The _build_llm_from_profile() factory dispatches on "LiteLLM" class name, so existing profiles are unaffected.
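
To make the last point concrete, a class-name dispatch of this kind might look like the sketch below. Everything here is illustrative: the stub classes, the `LLM_CLASSES` registry, the constructor signature, and the `"OpenAILLM"` / `"LMStudioLLM"` profile values are assumptions, and the real `_build_llm_from_profile()` may be structured differently.

```python
from typing import Any, Dict, Type


class BaseLLM:
    """Stub for the project's base class; the real one has more behavior."""

    def __init__(self, model_name: str, **options: Any) -> None:
        self.model_name = model_name
        self.options = options


class OpenAILLM(BaseLLM):
    pass


class LMStudioLLM(BaseLLM):
    pass


class LiteLLM(BaseLLM):
    pass


# Hypothetical registry keyed by the "llm_service_class" value of a profile.
LLM_CLASSES: Dict[str, Type[BaseLLM]] = {
    "OpenAILLM": OpenAILLM,
    "LMStudioLLM": LMStudioLLM,
    "LiteLLM": LiteLLM,
}


def build_llm_from_profile(name: str, profile: Dict[str, Any]) -> BaseLLM:
    """Pick the backend class by name; unknown names fail loudly."""
    class_name = profile["llm_service_class"]
    try:
        llm_class = LLM_CLASSES[class_name]
    except KeyError:
        raise ValueError(f"Unknown llm_service_class: {class_name!r}") from None
    # Pass the remaining profile keys through as backend options.
    options = {k: v for k, v in profile.items() if k != "llm_service_class"}
    return llm_class(name, **options)
```

Because the lookup only adds a new key, profiles that name one of the existing classes resolve exactly as before.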

henrydaum · 2026-05-27T01:35:15Z

Hey I am so sorry I am seeing this message 2 weeks after it was posted. I will get to this ASAP. Thank you for the contribution.

RheagalFire added 3 commits May 14, 2026 03:24

feat: add LiteLLM as unified LLM provider gateway

8d9abcf

fix: prevent rate limit errors from being misclassified as context limit

1f66636

rename LiteLLMLLM to LiteLLM

6a629f6

henrydaum closed this May 29, 2026

RheagalFire wants to merge 3 commits into henrydaum:main from RheagalFire:feat/add-litellm-provider
