
feat: add LiteLLM as unified LLM provider gateway #5

Closed
RheagalFire wants to merge 3 commits into henrydaum:main from RheagalFire:feat/add-litellm-provider


RheagalFire commented May 13, 2026

Summary

Adds LiteLLM as a third LLM backend alongside OpenAILLM and LMStudioLLM, enabling access to 100+ LLM providers (OpenAI, Anthropic, Google, AWS Bedrock, Azure, Groq, Mistral, Cohere, etc.) through the LiteLLM SDK.

Changes

| File | What |
| --- | --- |
| plugins/services/service_llm.py | New LiteLLM(BaseLLM) class (~150 LOC): invoke(), stream(), and chat_with_tools(), with a lazy litellm import, drop_params=True, image injection, tool call parsing, and token usage tracking. Rate limit, auth, and not-found errors are checked by class name before the is_context_limit_error() heuristic to prevent false positives. Updated the _build_llm_from_profile() factory to handle the "LiteLLM" class name. |
| requirements.txt | Added litellm>=1.60,<1.85 |
| tests/test_litellm_llm.py | 16 unit tests |
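
The lazy import and drop_params=True behaviour listed in the table could look roughly like the sketch below. This is not the code from this PR: `lazy_completion` and its `loader` parameter are hypothetical names introduced for illustration, and the only LiteLLM call assumed is `litellm.completion(model=..., messages=..., drop_params=True)`.

```python
import importlib
from typing import Any, Callable


def lazy_completion(
    model: str,
    messages: list[dict[str, Any]],
    loader: Callable[[str], Any] = importlib.import_module,  # hypothetical hook, eases testing
    **kwargs: Any,
) -> Any:
    """Call litellm.completion, importing litellm only when this function runs."""
    # Deferring the import means users of the other backends never need
    # the litellm package to be importable at startup.
    litellm = loader("litellm")
    # drop_params=True asks LiteLLM to drop parameters the target provider
    # does not support instead of raising an error.
    return litellm.completion(
        model=model, messages=messages, drop_params=True, **kwargs
    )
```

Because the import happens inside the function, a missing litellm install would only surface when a LiteLLM profile is actually used.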

Usage

Add a LiteLLM profile in your config:

{
  "llm_profiles": {
    "anthropic/claude-sonnet-4-6": {
      "llm_service_class": "LiteLLM",
      "llm_api_key": "ANTHROPIC_API_KEY",
      "llm_context_size": 200000
    },
    "openai/gpt-4o": {
      "llm_service_class": "LiteLLM",
      "llm_api_key": "OPENAI_API_KEY",
      "llm_context_size": 128000
    }
  },
  "default_llm_profile": "anthropic/claude-sonnet-4-6"
}

LiteLLM reads provider-specific environment variables (ANTHROPIC_API_KEY, OPENAI_API_KEY, etc.) automatically; alternatively, pass llm_api_key in the profile config.

Direct usage:

from plugins.services.service_llm import LiteLLM

llm = LiteLLM("anthropic/claude-sonnet-4-6")
llm.load()

result = llm.invoke([{"role": "user", "content": "What is 2+2?"}])
print(result.content)  # "4"

for chunk in llm.stream([{"role": "user", "content": "Say hello"}]):
    print(chunk, end="")

Any model string LiteLLM supports works:

anthropic/claude-sonnet-4-6
openai/gpt-4o
vertex_ai/gemini-2.5-flash
bedrock/anthropic.claude-sonnet-4-6-v2:0
groq/llama-4-scout-17b-16e-instruct
mistral/mistral-large-latest

Integration bug caught during deep-dive and fixed

is_context_limit_error() uses a heuristic that matches "tokens" + "limit" in error text. LiteLLM's rate limit messages (e.g., "Rate limit exceeded. Quota request exceeds the tokens limit.") contain both words, causing rate limits to be misclassified as context overflow. This would trigger the compact-and-retry path in conversation_loop.py instead of surfacing the actual error.

Fix: check exception class name (RateLimitError, AuthenticationError, NotFoundError) before the heuristic, so these deterministic errors short-circuit to provider_error instead of triggering context compaction.
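
The ordering described above can be illustrated with a small sketch. It is an approximation under stated assumptions, not the PR's implementation: `classify_error`, the `NON_RETRYABLE` set, the "context_limit" label, and the local `RateLimitError` stand-in are hypothetical, and `is_context_limit_error()` here is a simplified version of the "tokens" + "limit" heuristic.

```python
NON_RETRYABLE = {"RateLimitError", "AuthenticationError", "NotFoundError"}


def is_context_limit_error(message: str) -> bool:
    """Simplified stand-in for the heuristic: "tokens" and "limit" both appear."""
    text = message.lower()
    return "tokens" in text and "limit" in text


def classify_error(exc: Exception) -> str:
    """Return "provider_error" or "context_limit" (illustrative labels)."""
    # Check the class name first so rate-limit, auth, and not-found errors
    # never reach the text heuristic.
    if type(exc).__name__ in NON_RETRYABLE:
        return "provider_error"
    if is_context_limit_error(str(exc)):
        return "context_limit"
    return "provider_error"


class RateLimitError(Exception):
    """Local stand-in; the real exception class would come from LiteLLM."""


msg = "Rate limit exceeded. Quota request exceeds the tokens limit."
print(classify_error(RateLimitError(msg)))  # provider_error
print(classify_error(RuntimeError(msg)))    # context_limit
```

The second call shows the original failure mode: the same message, without the class-name check, is picked up by the heuristic and would be routed to compaction.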

Tests

Unit tests: 16/16 pass

$ python -m pytest tests/test_litellm_llm.py -v
tests/test_litellm_llm.py::test_build_llm_from_profile_litellm PASSED
tests/test_litellm_llm.py::test_chat_with_tools_delegates_to_invoke PASSED
tests/test_litellm_llm.py::test_chat_with_tools_passes_attachments PASSED
tests/test_litellm_llm.py::test_context_limit_raises_provider_error PASSED
tests/test_litellm_llm.py::test_image_capability_inferred PASSED
tests/test_litellm_llm.py::test_invoke_dispatches_to_litellm PASSED
tests/test_litellm_llm.py::test_invoke_forwards_api_key PASSED
tests/test_litellm_llm.py::test_invoke_forwards_base_url PASSED
tests/test_litellm_llm.py::test_invoke_not_loaded_returns_error PASSED
tests/test_litellm_llm.py::test_invoke_returns_prompt_tokens PASSED
tests/test_litellm_llm.py::test_invoke_tool_calls PASSED
tests/test_litellm_llm.py::test_null_content_returns_empty PASSED
tests/test_litellm_llm.py::test_rate_limit_not_misclassified_as_context_limit PASSED
tests/test_litellm_llm.py::test_response_format_passthrough PASSED
tests/test_litellm_llm.py::test_stream_not_loaded_returns_nothing PASSED
tests/test_litellm_llm.py::test_stream_yields_chunks PASSED
========================= 16 passed in 1.04s =========================

Live E2E: Anthropic Claude Sonnet 4-6

=== Test 1: invoke ===
Content: 4
Tokens: 20
Error: None

=== Test 2: stream ===
Stream: Hello|!

PASS

Risk / Compatibility

  • Additive only. OpenAILLM and LMStudioLLM untouched.
  • LiteLLM follows the same BaseLLM interface: invoke(), stream(), chat_with_tools() all return the same LLMResponse type.
  • litellm is a new dependency in requirements.txt. Users who don't use it can ignore it.
  • The _build_llm_from_profile() factory dispatches on "LiteLLM" class name, so existing profiles are unaffected.
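
To illustrate why dispatching on the class name leaves existing profiles unaffected, here is a minimal sketch of a name-keyed factory. The profile keys (llm_service_class, llm_api_key, llm_context_size) come from the config above; everything else, including the dataclass fields, the `build_llm_from_profile` signature, and using the profile name as the model string, is an assumption and may differ from the real `_build_llm_from_profile()`.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class BaseLLM:
    """Hypothetical minimal base class; the real BaseLLM has more to it."""
    model_name: str
    api_key: Optional[str] = None
    context_size: int = 0


class OpenAILLM(BaseLLM):
    """Existing backend (stub)."""


class LMStudioLLM(BaseLLM):
    """Existing backend (stub)."""


class LiteLLM(BaseLLM):
    """New backend (stub)."""


# Adding a backend means adding one entry; other entries are untouched.
_SERVICE_CLASSES = {cls.__name__: cls for cls in (OpenAILLM, LMStudioLLM, LiteLLM)}


def build_llm_from_profile(name: str, profile: dict[str, Any]) -> BaseLLM:
    """Instantiate the backend named by the profile's llm_service_class."""
    class_name = profile.get("llm_service_class", "")
    if class_name not in _SERVICE_CLASSES:
        raise ValueError(f"Unknown llm_service_class: {class_name!r}")
    return _SERVICE_CLASSES[class_name](
        model_name=name,  # assumption: the profile name doubles as the model string
        api_key=profile.get("llm_api_key"),
        context_size=profile.get("llm_context_size", 0),
    )


llm = build_llm_from_profile(
    "openai/gpt-4o",
    {"llm_service_class": "LiteLLM", "llm_context_size": 128000},
)
print(type(llm).__name__, llm.context_size)  # LiteLLM 128000
```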

@henrydaum
Owner

Hey, I am so sorry I am only seeing this message 2 weeks after it was posted. I will get to this ASAP. Thank you for the contribution.

@henrydaum henrydaum closed this May 29, 2026