diff --git a/docs.json b/docs.json
index e9fef12d..9f389259 100644
--- a/docs.json
+++ b/docs.json
@@ -172,7 +172,8 @@
           },
           "openhands/usage/advanced/configuration-options",
           "openhands/usage/advanced/custom-sandbox-guide",
-          "openhands/usage/advanced/search-engine-setup"
+          "openhands/usage/advanced/search-engine-setup",
+          "openhands/usage/developers/v1-local-agent-server"
         ]
       }
     ]
@@ -244,6 +245,7 @@
         "pages": [
           "sdk/guides/agent-server/overview",
           "sdk/guides/agent-server/local-server",
+          "sdk/guides/agent-server/app-integration",
           "sdk/guides/agent-server/docker-sandbox",
           "sdk/guides/agent-server/api-sandbox",
           "sdk/guides/agent-server/cloud-workspace",
diff --git a/openhands/usage/developers/v1-local-agent-server.mdx b/openhands/usage/developers/v1-local-agent-server.mdx
new file mode 100644
index 00000000..8477e659
--- /dev/null
+++ b/openhands/usage/developers/v1-local-agent-server.mdx
@@ -0,0 +1,92 @@
+---
+title: Run OpenHands App-Server with Local SDK Agent-Server
+description: Run the OpenHands V1 app-server backed by a purely local SDK agent-server, no Docker or remote runtime required.
+---
+
+This guide shows how to run the OpenHands V1 app-server while spawning a fully local agent-server process (no Docker, no remote runtime).
+
+
+Applies to the V1 app-server path enabled under /api/v1.
+
+
+## Prerequisites
+- Python 3.12, Node 22.x, Poetry 1.8+
+- OpenHands repository checked out
+- software-agent-sdk is an install-time dependency of OpenHands (see pyproject.toml), and provides the openhands.agent_server entrypoint
+
+## Where V1 Lives (Execution Path Overview)
+OpenHands V1 exposes its API under /api/v1 and keeps the V1 server logic in app_server/:
+- The FastAPI app includes V1 only when ENABLE_V1 != "0". See [server/app.py](https://github.com/OpenHands/OpenHands/blob/main/openhands/server/app.py#L90-L98).
+- The V1 router aggregates app_server endpoints. See [v1_router.py](https://github.com/OpenHands/OpenHands/blob/main/openhands/app_server/v1_router.py).
+- For a local runtime, the app-server launches a separate agent-server process (from software-agent-sdk) on a free localhost port via the process sandbox: [process_sandbox_service](https://github.com/OpenHands/OpenHands/blob/main/openhands/app_server/sandbox/process_sandbox_service.py#L113-L154). It executes python -m openhands.agent_server and probes /alive.
+- This selection comes from env→config wiring. When RUNTIME is local or process, injectors select the process-based sandbox/spec: see [app_server/config.py](https://github.com/OpenHands/OpenHands/blob/main/openhands/app_server/config.py#L102-L160) and [process_sandbox_spec_service.py](https://github.com/OpenHands/OpenHands/blob/main/openhands/app_server/sandbox/process_sandbox_spec_service.py#L17-L34).
+- On the SDK side, the agent-server exposes /alive and /health (and /api/*). See [server_details_router.py](https://github.com/OpenHands/software-agent-sdk/blob/main/openhands-agent-server/openhands/agent_server/server_details_router.py#L27-L41) and bootstrap in [agent_server/__main__.py](https://github.com/OpenHands/software-agent-sdk/blob/main/openhands-agent-server/openhands/agent_server/__main__.py#L1-L40).
+- The app-server talks to the agent-server using AsyncRemoteWorkspace and the agent-server REST/WebSocket API. See [AsyncRemoteWorkspace](https://github.com/OpenHands/software-agent-sdk/blob/main/openhands-sdk/openhands/sdk/workspace/remote/async_remote_workspace.py) and where the app-server builds conversation requests in _build_start_conversation_request_for_user ([source](https://github.com/OpenHands/OpenHands/blob/main/openhands/app_server/app_conversation/live_status_app_conversation_service.py#L929-L1014)).
+
+## Run App-Server with a Local Agent-Server
+
+### 1) Build OpenHands and Enable V1
+By default V1 is enabled (unless ENABLE_V1=0). Ensure you’re on Python 3.12 and have Poetry and Node installed, then:
+
+```bash Build
+export INSTALL_DOCKER=0 # No Docker for this guide
+export RUNTIME=process # Force process-based sandbox (spawns sub-process agent-server)
+# optional: export ENABLE_V1=1 # V1 is enabled by default; set explicitly if you disabled it elsewhere
+
+make build
+```
+
+### 2) Run Backend and Frontend
+For local development, either run everything together or separately.
+
+```bash Full App (Backend + Frontend)
+# Adjust ports/hosts if you’re on a remote machine.
+make run FRONTEND_PORT=12000 FRONTEND_HOST=0.0.0.0 BACKEND_HOST=0.0.0.0
+```
+
+or start servers individually:
+
+```bash Separate Servers
+make start-backend
+make start-frontend
+```
+
+### 3) Start a V1 Conversation (Spawns Local Agent-Server)
+From the UI (Chat → Start) or via the V1 endpoints under /api/v1, create a new conversation. The app-server will:
+- Start a sandbox using the process sandbox service, which launches a local agent-server process via python -m openhands.agent_server
+- Poll the agent-server /alive endpoint until it reports status ok
+- Store the agent-server URL and a per-sandbox session API key, then reuse it for subsequent operations
+
+### 4) Verify the Local Agent-Server
+You can list sandboxes via V1 API and find the exposed agent-server URL; the name in the response is "agent-server". The health checks are:
+- GET `{agent_server_url}/alive` → `{"status":"ok"}` (see [server_details_router.py](https://github.com/OpenHands/software-agent-sdk/blob/main/openhands-agent-server/openhands/agent_server/server_details_router.py#L27-L41))
+- GET `{agent_server_url}/server_info` for uptime/idle info
+
+### 5) How the App-Server Talks to the Agent-Server
+When reading/writing files or executing commands in the conversation’s workspace, the app-server uses AsyncRemoteWorkspace that targets the agent-server. Example usage: [app_conversation_router.read_conversation_file](https://github.com/OpenHands/OpenHands/blob/main/openhands/app_server/app_conversation/app_conversation_router.py#L335-L409) and throughout [live_status_app_conversation_service](https://github.com/OpenHands/OpenHands/blob/main/openhands/app_server/app_conversation/live_status_app_conversation_service.py).
+
+## Customize the Agent Loaded by the GUI
+
+For customization paths (skills/MCP, agent presets, or defining a new Agent type the app can instantiate), see:
+- /sdk/guides/agent-server/app-integration
+- /sdk/guides/agent-custom
+- /sdk/guides/custom-tools
+
+## Troubleshooting
+- If no agent-server starts when you create a V1 conversation, ensure:
+  - ENABLE_V1 is not "0" (V1 router included): [server/app.py](https://github.com/OpenHands/OpenHands/blob/main/openhands/server/app.py#L90-L98)
+  - RUNTIME = local or process so the process sandbox is selected: [config_from_env](https://github.com/OpenHands/OpenHands/blob/main/openhands/app_server/config.py#L102-L160)
+  - The spawned server is probed via /alive: [process_sandbox_service](https://github.com/OpenHands/OpenHands/blob/main/openhands/app_server/sandbox/process_sandbox_service.py#L155-L200)
+- To inspect the agent-server API, open its Swagger at `{agent_server_url}/docs` (the server is started via uvicorn in [__main__.py](https://github.com/OpenHands/software-agent-sdk/blob/main/openhands-agent-server/openhands/agent_server/__main__.py#L1-L40)).
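
When troubleshooting, the `/alive` probe described in this guide can be reproduced by hand. The following is a minimal sketch using only the Python standard library; the helper names `is_alive_payload` and `wait_until_alive`, the timeout defaults, and the example URL are assumptions for illustration and are not taken from the app-server's own implementation.

```python
import json
import time
import urllib.error
import urllib.request


def is_alive_payload(body: str) -> bool:
    """Return True when the body is a JSON object with status == "ok"."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def wait_until_alive(agent_server_url: str, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll {agent_server_url}/alive until it reports ok or the timeout elapses.

    The timeout and interval defaults are illustrative assumptions.
    """
    url = agent_server_url.rstrip("/") + "/alive"
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2.0) as resp:
                body = resp.read().decode("utf-8", errors="replace")
                if is_alive_payload(body):
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server is not accepting connections yet; retry
        time.sleep(interval)
    return False


if __name__ == "__main__":
    # Replace with the agent-server URL reported by the V1 sandbox listing.
    print(wait_until_alive("http://127.0.0.1:8000", timeout=5.0))
```

If this returns `False` while a conversation is being created, the agent-server process most likely never started, which points back to the ENABLE_V1 and RUNTIME checks above.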
+
+## References
+- OpenHands V1 app-server boot and routing: [server/app.py](https://github.com/OpenHands/OpenHands/blob/main/openhands/server/app.py#L90-L98), [v1_router.py](https://github.com/OpenHands/OpenHands/blob/main/openhands/app_server/v1_router.py)
+- Sandbox process service (spawns local agent-server): [process_sandbox_service.py](https://github.com/OpenHands/OpenHands/blob/main/openhands/app_server/sandbox/process_sandbox_service.py#L113-L200)
+- Env→injector selection for local runtime: [config.py](https://github.com/OpenHands/OpenHands/blob/main/openhands/app_server/config.py#L102-L160)
+- Default command and env to start agent-server: [process_sandbox_spec_service.py](https://github.com/OpenHands/OpenHands/blob/main/openhands/app_server/sandbox/process_sandbox_spec_service.py#L17-L34)
+- Agent-server app and health endpoints: [__main__.py](https://github.com/OpenHands/software-agent-sdk/blob/main/openhands-agent-server/openhands/agent_server/__main__.py#L1-L40), [server_details_router.py](https://github.com/OpenHands/software-agent-sdk/blob/main/openhands-agent-server/openhands/agent_server/server_details_router.py#L27-L41)
+- Agent-server conversation API: [conversation_router.py](https://github.com/OpenHands/software-agent-sdk/blob/main/openhands-agent-server/openhands/agent_server/conversation_router.py)
+- AsyncRemoteWorkspace client: [async_remote_workspace.py](https://github.com/OpenHands/software-agent-sdk/blob/main/openhands-sdk/openhands/sdk/workspace/remote/async_remote_workspace.py)
+- V1 agent composition in app-server: [live_status_app_conversation_service.py](https://github.com/OpenHands/OpenHands/blob/main/openhands/app_server/app_conversation/live_status_app_conversation_service.py)
+- Skills loading and merge points: [app_conversation_service_base.py](https://github.com/OpenHands/OpenHands/blob/main/openhands/app_server/app_conversation/app_conversation_service_base.py#L56-L115), [skill_loader.py](https://github.com/OpenHands/OpenHands/blob/main/openhands/app_server/app_conversation/skill_loader.py)
+- SDK documentation on agent-server modes: [/sdk/guides/agent-server/overview](/sdk/guides/agent-server/overview), [/sdk/guides/agent-server/local-server](/sdk/guides/agent-server/local-server)
diff --git a/sdk/guides/agent-server/app-integration.mdx b/sdk/guides/agent-server/app-integration.mdx
new file mode 100644
index 00000000..b0d3d1eb
--- /dev/null
+++ b/sdk/guides/agent-server/app-integration.mdx
@@ -0,0 +1,64 @@
+---
+title: Integrate SDK Agents with the OpenHands App
+description: How the OpenHands app uses the SDK Agent Server and where to customize the agent that the GUI loads
+---
+
+This guide explains how the OpenHands app (V1) composes an SDK Agent when you start a conversation in the GUI, and where you can customize behavior using the SDK.
+
+
+Best read alongside:
+- Agent Server Overview
+- Run a Local Agent Server
+- Creating Custom Agent
+
+
+## How the App Loads an SDK Agent
+
+When you start a V1 conversation from the GUI, the app-server:
+- Launches an SDK Agent Server (locally or via the configured host)
+- Builds a StartConversationRequest that includes an Agent specification
+- Sends that request to the Agent Server's conversation API
+
+Key references:
+- App builds the V1 router only if V1 is enabled: server/app.py
+- Conversation request assembly: live_status_app_conversation_service.py
+- Agent Server endpoints: conversation_router.py, server_details_router.py
+
+## Customization Paths
+
+### 1) Skills and MCP (No-Code)
+The app merges skills and MCP server config into the agent context:
+- User skills from `~/.openhands/skills` and `~/.openhands/microagents/`
+- Repo/org skills resolved from your working repository
+- MCP servers from settings and defaults (OpenHands + Tavily)
+
+This path lets you steer agent behavior without changing code.
+
+### 2) Agent Type Selection (Low-Code)
+The app selects an Agent preset based on AgentType (DEFAULT or PLAN). You can:
+- Toggle the AgentType in the Start request (UI or API)
+- Adjust LLM and MCP settings in the UI
+
+See also SDK presets for default and planning agents.
+
+### 3) Define a New Agent Type in the App (Advanced)
+If you need the GUI to instantiate a different agent layout (custom tools, system prompt, etc.):
+1. Add a new enum value to AgentType in the app
+2. Extend the builder to construct your custom Agent for that type
+3. Optionally expose it in the frontend for selection
+
+This keeps the App as the source of truth for Agent construction while leveraging SDK components.
+
+## Validate with the SDK First
+
+Before wiring into the App, validate your design directly with the SDK:
+- Run the Local Agent Server guide to test endpoints
+- Use the Creating Custom Agent guide to build presets and behaviors
+
+## See Also
+
+- /sdk/guides/agent-custom
+- /sdk/guides/custom-tools
+- /sdk/guides/mcp
+- /sdk/guides/agent-server/overview
+- /sdk/guides/agent-server/local-server