Merged
30 changes: 24 additions & 6 deletions docs/inference/configure.md
@@ -35,7 +35,9 @@ The configuration consists of two values:
| Provider record | The credential backend OpenShell uses to authenticate with the upstream model host. |
| Model ID | The model to use for generation requests. |

## Step 1: Create a Provider
For a list of tested providers and their base URLs, refer to [Supported Inference Providers](../sandboxes/manage-providers.md#supported-inference-providers).

## Create a Provider

Create a provider that holds the backend credentials you want OpenShell to use.

@@ -51,7 +53,23 @@ This reads `NVIDIA_API_KEY` from your environment.

::::

::::{tab-item} Local / self-hosted endpoint
::::{tab-item} OpenAI-compatible Provider

Any cloud provider that exposes an OpenAI-compatible API works with the `openai` provider type. You need three values from the provider: the base URL, an API key, and a model name.

```console
$ openshell provider create \
--name my-cloud-provider \
--type openai \
--credential OPENAI_API_KEY=<your_api_key> \
--config OPENAI_BASE_URL=https://api.example.com/v1
```

Replace the base URL and API key with the values from your provider. For a list of providers tested out of the box, refer to [Supported Inference Providers](../sandboxes/manage-providers.md#supported-inference-providers). For any other provider, check its documentation for the correct base URL, available models, and API key setup.
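You can optionally sanity-check the base URL and key before registering them. This is a sketch that assumes the endpoint exposes the standard OpenAI-compatible `/models` route; `https://api.example.com/v1` is a placeholder:

```console
$ curl -s https://api.example.com/v1/models \
    -H "Authorization: Bearer $OPENAI_API_KEY"
```

A JSON response with a `data` array of model IDs confirms both values; the IDs in that array are what you later pass as the model name.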

::::

::::{tab-item} Local Endpoint

```console
$ openshell provider create \
@@ -77,7 +95,7 @@ This reads `ANTHROPIC_API_KEY` from your environment.

:::::

## Step 2: Set Inference Routing
## Set Inference Routing

Point `inference.local` at that provider and choose the model to use:

@@ -87,7 +105,7 @@ $ openshell inference set \
--model nvidia/nemotron-3-nano-30b-a3b
```

## Step 3: Verify the Active Config
## Verify the Active Config

Confirm that the provider and model are set correctly:

@@ -100,7 +118,7 @@ Gateway inference:
Version: 1
```

## Step 4: Update Part of the Config
## Update Part of the Config

Use `update` when you want to change only one field:

@@ -114,7 +132,7 @@ Or switch providers without repeating the current model:
$ openshell inference update --provider openai-prod
```

## Use It from a Sandbox
## Use the Local Endpoint from a Sandbox

After inference is configured, code inside any sandbox can call `https://inference.local` directly:
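For example, a minimal request from inside a sandbox, sketched under the assumption that the endpoint follows the OpenAI-style chat completions route and uses the model configured earlier:

```console
$ curl -s https://inference.local/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "nvidia/nemotron-3-nano-30b-a3b",
         "messages": [{"role": "user", "content": "Say hello."}]}'
```

No API key header is needed: credentials come from the configured provider record, not from the sandbox.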

2 changes: 1 addition & 1 deletion docs/inference/index.md
@@ -44,7 +44,7 @@ If code calls an external inference host directly, that traffic is evaluated onl
|---|---|
| Credentials | No sandbox API keys needed. Credentials come from the configured provider record. |
| Configuration | One provider and one model define sandbox inference for the active gateway. Every sandbox on that gateway sees the same `inference.local` backend. |
| Provider support | OpenAI, Anthropic, and NVIDIA providers all work through the same endpoint. |
| Provider support | NVIDIA, any OpenAI-compatible provider, and Anthropic all work through the same endpoint. |
| Hot-refresh | OpenShell picks up provider credential changes and inference updates without recreating sandboxes. Changes propagate within about 5 seconds by default. |

## Supported API Patterns
21 changes: 19 additions & 2 deletions docs/sandboxes/manage-providers.md
@@ -132,17 +132,34 @@ The following provider types are supported.
|---|---|---|
| `claude` | `ANTHROPIC_API_KEY`, `CLAUDE_API_KEY` | Claude Code, Anthropic API |
| `codex` | `OPENAI_API_KEY` | OpenAI Codex |
| `opencode` | `OPENCODE_API_KEY`, `OPENROUTER_API_KEY`, `OPENAI_API_KEY` | opencode tool |
| `generic` | User-defined | Any service with custom credentials |
| `github` | `GITHUB_TOKEN`, `GH_TOKEN` | GitHub API, `gh` CLI — refer to {doc}`/tutorials/github-sandbox` |
| `gitlab` | `GITLAB_TOKEN`, `GLAB_TOKEN`, `CI_JOB_TOKEN` | GitLab API, `glab` CLI |
| `nvidia` | `NVIDIA_API_KEY` | NVIDIA API Catalog |
| `generic` | User-defined | Any service with custom credentials |
| `openai` | `OPENAI_API_KEY` | Any OpenAI-compatible endpoint. Set `--config OPENAI_BASE_URL` to point to the provider. Refer to {doc}`/inference/configure`. |
| `opencode` | `OPENCODE_API_KEY`, `OPENROUTER_API_KEY`, `OPENAI_API_KEY` | opencode tool |

:::{tip}
Use the `generic` type for any service not listed above. You define the
environment variable names and values yourself with `--credential`.
:::

## Supported Inference Providers

The following providers have been tested with `inference.local`. Any provider that exposes an OpenAI-compatible API works with the `openai` type. Set `--config OPENAI_BASE_URL` to the provider's base URL and `--credential OPENAI_API_KEY` to your API key.

| Provider | Name | Type | Base URL | API Key Variable |
|---|---|---|---|---|
| NVIDIA API Catalog | `nvidia-prod` | `nvidia` | `https://integrate.api.nvidia.com/v1` | `NVIDIA_API_KEY` |
| Anthropic | `anthropic-prod` | `claude` | `https://api.anthropic.com` | `ANTHROPIC_API_KEY` |
| Baseten | `baseten` | `openai` | `https://inference.baseten.co/v1` | `OPENAI_API_KEY` |
| Bitdeer AI | `bitdeer` | `openai` | `https://api-inference.bitdeer.ai/v1` | `OPENAI_API_KEY` |
| Deepinfra | `deepinfra` | `openai` | `https://api.deepinfra.com/v1/openai` | `OPENAI_API_KEY` |
| Ollama (local) | `ollama` | `openai` | `http://host.openshell.internal:11434/v1` | `OPENAI_API_KEY` |
| LM Studio (local) | `lmstudio` | `openai` | `http://host.openshell.internal:1234/v1` | `OPENAI_API_KEY` |

Refer to your provider's documentation for the correct base URL, available models, and API key setup. To configure inference routing, refer to {doc}`/inference/configure`.
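As a worked example, the Ollama row above maps onto `provider create` like this (a sketch; the provider name is arbitrary, and the key value is a placeholder since a local Ollama server does not validate it):

```console
$ openshell provider create \
    --name ollama \
    --type openai \
    --credential OPENAI_API_KEY=unused \
    --config OPENAI_BASE_URL=http://host.openshell.internal:11434/v1
```

The same pattern applies to the other `openai`-type rows: substitute the base URL from the table and a real API key.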

## Next Steps

Explore related topics: