diff --git a/docs/inference/configure.md b/docs/inference/configure.md
index 4c86dce5..a0d17498 100644
--- a/docs/inference/configure.md
+++ b/docs/inference/configure.md
@@ -35,7 +35,9 @@ The configuration consists of two values:
 | Provider record | The credential backend OpenShell uses to authenticate with the upstream model host. |
 | Model ID | The model to use for generation requests. |
 
-## Step 1: Create a Provider
+For a list of tested providers and their base URLs, refer to [Supported Inference Providers](../sandboxes/manage-providers.md#supported-inference-providers).
+
+## Create a Provider
 
 Create a provider that holds the backend credentials you want OpenShell to use.
 
@@ -51,7 +53,23 @@ This reads `NVIDIA_API_KEY` from your environment.
 
 ::::
 
-::::{tab-item} Local / self-hosted endpoint
+::::{tab-item} OpenAI-compatible Provider
+
+Any cloud provider that exposes an OpenAI-compatible API works with the `openai` provider type. You need three values from the provider: the base URL, an API key, and a model name.
+
+```console
+$ openshell provider create \
+  --name my-cloud-provider \
+  --type openai \
+  --credential OPENAI_API_KEY= \
+  --config OPENAI_BASE_URL=https://api.example.com/v1
+```
+
+Replace the base URL and API key with the values from your provider. For providers supported out of the box, refer to [Supported Inference Providers](../sandboxes/manage-providers.md#supported-inference-providers). For other providers, refer to your provider's documentation for the correct base URL, available models, and API key setup.
+
+::::
+
+::::{tab-item} Local Endpoint
 
 ```console
 $ openshell provider create \
@@ -77,7 +95,7 @@ This reads `ANTHROPIC_API_KEY` from your environment.
 
 :::::
 
-## Step 2: Set Inference Routing
+## Set Inference Routing
 
 Point `inference.local` at that provider and choose the model to use:
 
@@ -87,7 +105,7 @@ $ openshell inference set \
   --model nvidia/nemotron-3-nano-30b-a3b
 ```
 
-## Step 3: Verify the Active Config
+## Verify the Active Config
 
 Confirm that the provider and model are set correctly:
 
@@ -100,7 +118,7 @@ Gateway inference:
   Version: 1
 ```
 
-## Step 4: Update Part of the Config
+## Update Part of the Config
 
 Use `update` when you want to change only one field:
 
@@ -114,7 +132,7 @@ Or switch providers without repeating the current model:
 $ openshell inference update --provider openai-prod
 ```
 
-## Use It from a Sandbox
+## Use the Local Endpoint from a Sandbox
 
 After inference is configured, code inside any sandbox can call `https://inference.local` directly:
 
diff --git a/docs/inference/index.md b/docs/inference/index.md
index 3a34affb..92e2b103 100644
--- a/docs/inference/index.md
+++ b/docs/inference/index.md
@@ -44,7 +44,7 @@ If code calls an external inference host directly, that traffic is evaluated onl
 |---|---|
 | Credentials | No sandbox API keys needed. Credentials come from the configured provider record. |
 | Configuration | One provider and one model define sandbox inference for the active gateway. Every sandbox on that gateway sees the same `inference.local` backend. |
-| Provider support | OpenAI, Anthropic, and NVIDIA providers all work through the same endpoint. |
+| Provider support | NVIDIA, any OpenAI-compatible provider, and Anthropic all work through the same endpoint. |
 | Hot-refresh | OpenShell picks up provider credential changes and inference updates without recreating sandboxes. Changes propagate within about 5 seconds by default. |
 
 ## Supported API Patterns
diff --git a/docs/sandboxes/manage-providers.md b/docs/sandboxes/manage-providers.md
index d238e3ee..bad66c59 100644
--- a/docs/sandboxes/manage-providers.md
+++ b/docs/sandboxes/manage-providers.md
@@ -132,17 +132,34 @@ The following provider types are supported.
 |---|---|---|
 | `claude` | `ANTHROPIC_API_KEY`, `CLAUDE_API_KEY` | Claude Code, Anthropic API |
 | `codex` | `OPENAI_API_KEY` | OpenAI Codex |
-| `opencode` | `OPENCODE_API_KEY`, `OPENROUTER_API_KEY`, `OPENAI_API_KEY` | opencode tool |
+| `generic` | User-defined | Any service with custom credentials |
 | `github` | `GITHUB_TOKEN`, `GH_TOKEN` | GitHub API, `gh` CLI — refer to {doc}`/tutorials/github-sandbox` |
 | `gitlab` | `GITLAB_TOKEN`, `GLAB_TOKEN`, `CI_JOB_TOKEN` | GitLab API, `glab` CLI |
 | `nvidia` | `NVIDIA_API_KEY` | NVIDIA API Catalog |
-| `generic` | User-defined | Any service with custom credentials |
+| `openai` | `OPENAI_API_KEY` | Any OpenAI-compatible endpoint. Set `--config OPENAI_BASE_URL` to point to the provider. Refer to {doc}`/inference/configure`. |
+| `opencode` | `OPENCODE_API_KEY`, `OPENROUTER_API_KEY`, `OPENAI_API_KEY` | opencode tool |
 
 :::{tip}
 Use the `generic` type for any service not listed above. You define the environment variable names and values yourself with `--credential`.
 :::
 
+## Supported Inference Providers
+
+The following providers have been tested with `inference.local`. Any provider that exposes an OpenAI-compatible API works with the `openai` type. Set `--config OPENAI_BASE_URL` to the provider's base URL and `--credential OPENAI_API_KEY` to your API key.
+
+| Provider | Name | Type | Base URL | API Key Variable |
+|---|---|---|---|---|
+| NVIDIA API Catalog | `nvidia-prod` | `nvidia` | `https://integrate.api.nvidia.com/v1` | `NVIDIA_API_KEY` |
+| Anthropic | `anthropic-prod` | `anthropic` | `https://api.anthropic.com` | `ANTHROPIC_API_KEY` |
+| Baseten | `baseten` | `openai` | `https://inference.baseten.co/v1` | `OPENAI_API_KEY` |
+| Bitdeer AI | `bitdeer` | `openai` | `https://api-inference.bitdeer.ai/v1` | `OPENAI_API_KEY` |
+| Deepinfra | `deepinfra` | `openai` | `https://api.deepinfra.com/v1/openai` | `OPENAI_API_KEY` |
+| Ollama (local) | `ollama` | `openai` | `http://host.openshell.internal:11434/v1` | `OPENAI_API_KEY` |
+| LM Studio (local) | `lmstudio` | `openai` | `http://host.openshell.internal:1234/v1` | `OPENAI_API_KEY` |
+
+Refer to your provider's documentation for the correct base URL, available models, and API key setup. To configure inference routing, refer to {doc}`/inference/configure`.
+
 ## Next Steps
 
 Explore related topics:
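The configure.md changes above end on the claim that sandbox code can call `https://inference.local` directly, with no sandbox API key. As a minimal sketch of what such a call might look like — assuming the conventional OpenAI-compatible `/v1/chat/completions` path and response shape, neither of which this diff spells out — a sandboxed script could build the request like this:

```python
import json

# Hypothetical endpoint path: `/v1/chat/completions` follows the
# OpenAI-compatible convention; the diff only documents the host
# `https://inference.local`.
ENDPOINT = "https://inference.local/v1/chat/completions"

# The model name matches the `openshell inference set` example above.
payload = {
    "model": "nvidia/nemotron-3-nano-30b-a3b",
    "messages": [
        {"role": "user", "content": "Summarize this repository in one sentence."},
    ],
}
body = json.dumps(payload).encode()

# Inside a sandbox the request could then be sent with the standard library:
#
#   import urllib.request
#   req = urllib.request.Request(
#       ENDPOINT, data=body, headers={"Content-Type": "application/json"}
#   )
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp)["choices"][0]["message"]["content"])
print(ENDPOINT, len(body))
```

Note that no API key appears in the request: per the diff, credentials come from the provider record configured on the gateway, not from the sandbox.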