Experiment: sandbox Coder Agents bash tool-call subprocesses with Coder Boundary (per-tool-call egress allowlist) #46

@ausbru87

UPDATE - read the "REVISED APPROACH" comment first. Verification against the Coder Agents docs and reference/coder showed the correct, surgical target is the Coder Agents tool-call exec chokepoint (agent/agentproc/process.go:124), wrapping only the tool-call subprocess, NOT wrapping the whole workspace agent process. Wrapping the agent process (the original framing below) is fragile and risks the control-plane link; it is kept here as background. The agent-networking allowlist comment is still useful context. Implement the REVISED APPROACH.


Summary

The live firewalled template proves Coder Boundary (agent firewall) works for the Claude Code process: it runs boundary -- claude and enforces a default-deny HTTP(S) allowlist (landjail / Landlock). It does not cover the Coder agent process itself or anything else the agent spawns. This experiment builds an agents-firewalled template variant that wraps the in-workspace Coder agent process with Boundary, then determines whether that holds (egress is jailed and audited) or breaks the workspace (the agent cannot reach the control plane; SSH, port-forward, or the Tasks UI fail; and so on).

Hypothesis: wrapping the Coder agent process under Boundary will firewall all agent-spawned tool egress, but risks breaking the agent's own control-plane connectivity unless the allowlist and transport handling are correct.

This is a research/experiment task. Expect it may prove infeasible as-is; a clear negative result with evidence is an acceptable outcome.

Background: current working baseline (do not regress it)

  • Live template firewalled (org coder, current version weary_ryan08), workspace austenplatform/firewall-test. It is identical to the claude-code template except that Boundary wraps the claude process, and only that process.
  • Boundary wiring lives in the claude-code module (4.7.3) inputs:
    • enable_boundary = true makes the module launch boundary -- claude (via AgentAPI: agentapi server --type claude -- boundary -- claude ...).
    • use_boundary_directly = true, boundary_version = "latest" (currently boundary v0.9.0, installed at /usr/local/bin/boundary, MIT standalone binary, not the coder boundary subcommand).
    • The module passes no --allow / --jail-type flags, so the allowlist and jail type come only from ~/.config/coder_boundary/config.yaml.
  • The allowlist is rendered from boundary.config.yaml.tftpl (adapted from the Red Hat Summit 2026 demo) via Terraform templatefile(), base64-written by the module pre_install_script. It allows dev.usgov.coderdemo.io (the AI Gateway egress), gitlab.usgov.coderdemo.io, api.anthropic.com, package registries, and test domains; npm is intentionally omitted so that it serves as the demo DENY.
  • Boundary v0.9.0 dropped config auto-discovery, so the template sets agent env BOUNDARY_CONFIG_FILE=/home/coder/.config/coder_boundary/config.yaml and BOUNDARY_JAIL_TYPE=landjail.
  • The pod runs the Coder agent unwrapped: kubernetes_pod_v1.workspace container command is ["sh","-c", coder_agent.main.init_script]. Boundary only ever wraps claude, never the agent.
  • Note: the firewalled template is currently live-only and not in git. Part of this task is to land the new template (and ideally the existing firewalled baseline) in the repo under coder-templates/.
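
The boundary -- claude launch shape described above extends to any single command, which is the kind of wrapping the revised approach needs at the tool-call level. The following is a minimal shell sketch of that shape, not the agent-side implementation. It assumes the env vars the template already sets; run_jailed and BOUNDARY_BIN are hypothetical names introduced here for illustration and are not part of Boundary or Coder.

```sh
# Sketch: run one command under Boundary, mirroring `boundary -- claude`.
# run_jailed and BOUNDARY_BIN are illustrative names, not Boundary or Coder API.
run_jailed() {
  # Boundary v0.9.0 no longer auto-discovers its config, so refuse to run
  # without the env var the template sets.
  if [ -z "${BOUNDARY_CONFIG_FILE:-}" ]; then
    echo "run_jailed: BOUNDARY_CONFIG_FILE is unset" >&2
    return 1
  fi
  if [ "$#" -eq 0 ]; then
    echo "usage: run_jailed <command> [args...]" >&2
    return 2
  fi
  # No --allow or --jail-type flags are passed: policy comes from the config
  # file and BOUNDARY_JAIL_TYPE, as in the baseline module.
  "${BOUNDARY_BIN:-/usr/local/bin/boundary}" -- "$@"
}
```

For example, run_jailed curl -sS https://api.anthropic.com/ would be expected to pass the baseline allowlist, while a request to an npm domain would be expected to be denied, since npm is omitted from it.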

Goal / acceptance criteria

See the REVISED APPROACH comment for the corrected, authoritative goal and acceptance criteria (sandbox only the Coder Agents tool-call subprocesses; keep the workspace agent healthy). The original goal below is retained as background.

Deliver an agents-firewalled template (new coder-templates/agents-firewalled/, committed) and a written result. Done = one of:

  • SUCCESS: a workspace from the template comes up healthy (agent connected, web terminal + Tasks UI work, git clone to in-cluster GitLab works), AND agent-spawned egress to an off-allowlist domain is denied and audited in /tmp/boundary_logs and surfaces in the Grafana agent-firewall dashboard (uid agent-firewall). Capture a clean DENY and a short recording.
  • DOCUMENTED NEGATIVE: a clear writeup of exactly what breaks (with logs), why (transport/connectivity analysis), and what would be required to make it work (e.g. allowlist additions, transport carve-outs, or a different wrapping point). Restore the cluster to the pre-experiment state.

Either way: do not regress the working firewalled template, and capture the boundary decision-log -> Loki -> dashboard path (the agent-firewall panels previously read zero; confirm allow/deny records actually reach Loki).
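
One way to generate a repeatable allow/deny pair for the acceptance check is a small probe that classifies a request by curl's exit status. This is a sketch under assumptions: probe_egress is a hypothetical helper, it says nothing about how Boundary itself signals a denial, and it cannot distinguish a firewall block from an ordinary network failure. The records in /tmp/boundary_logs remain the actual evidence.

```sh
# Sketch: classify one egress attempt by curl's exit status.
# probe_egress is an illustrative helper; it does not parse Boundary's logs.
probe_egress() {
  url=$1
  # -f turns HTTP errors (>= 400) into a non-zero exit, so a proxy-level
  # rejection and a failed connection both land in the second branch.
  if curl -fsS -o /dev/null --max-time 10 "$url"; then
    echo "REACHABLE $url"
  else
    echo "BLOCKED_OR_FAILED $url"
  fi
}
```

Run it from inside the jailed process tree, once against an allowlisted host and once against an off-allowlist host, then compare the output with the decision log and the dashboard panels.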

Environment and access

. ~/.config/usgov-coderdemo/env
export KUBECONFIG=/home/coder/demoenv-workspace/usgov-coderdemo/kubeconfig
export PATH="$HOME/.local/bin:$PATH"
unset CODER_URL CODER_SESSION_TOKEN CODER_AGENT_TOKEN   # ambient CODER_URL points at dev.coder.com; never target it
export CODER_URL=https://dev.usgov.coderdemo.io
  • coder CLI session at ~/.config/coderv2/ is valid as admin. For curl: TOK=$(cat ~/.config/coderv2/session), header Coder-Session-Token: $TOK.
  • Workspaces run in the coder-workspaces namespace as pods named coder-<workspace-id>. Inspect with kubectl exec.
  • Coder source to verify against: ~/demoenv-workspace/reference/coder (~2.34.1). Reference allowlist + bridge pattern: ~/demoenv-workspace/reference/demo-aigov-rhsummit-2026.
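
Because the ambient CODER_URL points at the wrong deployment, a guard in front of any API call can help avoid targeting it by accident. The sketch below is one possible shape; require_demo_target and whoami_api are hypothetical helper names, and the /api/v2/users/me path is an assumption about the Coder REST API that should be checked against the reference source.

```sh
# Sketch: refuse to make API calls unless CODER_URL is the demo deployment.
# require_demo_target and whoami_api are illustrative helper names.
require_demo_target() {
  expected="https://dev.usgov.coderdemo.io"
  if [ "${CODER_URL:-}" != "$expected" ]; then
    echo "refusing: CODER_URL is '${CODER_URL:-unset}', expected $expected" >&2
    return 1
  fi
}

# Read the session token without echoing it, then call the API.
# The endpoint path is an assumption; verify it before relying on it.
whoami_api() {
  require_demo_target || return 1
  TOK=$(cat "$HOME/.config/coderv2/session") || return 1
  curl -fsS -H "Coder-Session-Token: $TOK" "$CODER_URL/api/v2/users/me"
}
```

The token is only ever passed as a header value and never printed, in line with the guardrail on secrets.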

Guardrails

  • Do not modify or rebuild the working firewalled or claude-code templates; add a new agents-firewalled template.
  • Use a single dedicated test workspace; do not disrupt other users' workspaces.
  • Never print secret values. Do not run gitlab-rails (memory constrained).
  • Repo conventions: no emdash, endash, or spaced double-hyphen; commit messages follow type(scope): message with a real path as the scope, and the body ends with the line Generated by Coder Agents. Open a PR (do not push to main); the PR discloses "Generated by Coder Agents, on behalf of @ausbru87". Pull with --rebase before pushing, use an explicit refspec, and never force-push.
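
The commit conventions above are mechanical enough to check before committing. The following is a rough sketch, assuming the disclosure line is exactly Generated by Coder Agents with no trailing period; check_commit_msg is a hypothetical helper, not an existing repo script, and it cannot verify that the scope is a real path.

```sh
# Sketch: lint a commit message against the repo conventions listed above.
# check_commit_msg is an illustrative helper, not an existing repo script.
check_commit_msg() {
  msg=$1
  emdash=$(printf '\342\200\224')   # U+2014 as UTF-8 bytes
  endash=$(printf '\342\200\223')   # U+2013 as UTF-8 bytes
  case $msg in
    *"$emdash"*|*"$endash"*|*" -- "*)
      echo "forbidden dash sequence in message" >&2
      return 1 ;;
  esac
  # Subject line: type(scope): message, with a non-empty scope.
  subject=$(printf '%s\n' "$msg" | head -n 1)
  if ! printf '%s\n' "$subject" | grep -Eq '^[a-z]+\([^()[:space:]]+\): .+'; then
    echo "subject must look like type(scope): message" >&2
    return 1
  fi
  # The last non-empty line of the body must be the disclosure line.
  last=$(printf '%s\n' "$msg" | awk 'NF { line = $0 } END { print line }')
  if [ "$last" != "Generated by Coder Agents" ]; then
    echo "body must end with: Generated by Coder Agents" >&2
    return 1
  fi
}
```

Note that the spaced double-hyphen rule also rejects a message that quotes the Boundary launch command verbatim, so such commands need rewording in commit text.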

Rollback

  • Delete the test workspace and the agents-firewalled template version if the experiment fails: coder templates delete agents-firewalled --org coder (after deleting its workspaces). The firewalled baseline and all other templates remain untouched.
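
The ordering constraint in the rollback (workspaces first, then the template) can be made explicit with a dry-run helper that only prints the commands. This is a sketch: rollback_plan is a hypothetical name, the workspace names are supplied by the caller, and the --yes flag on coder delete is an assumption to confirm against the installed CLI.

```sh
# Sketch: print the rollback commands in the required order:
# workspaces first, then the template. Printing only; nothing is deleted.
rollback_plan() {
  for ws in "$@"; do
    # --yes is assumed to skip the confirmation prompt; verify with --help.
    echo "coder delete $ws --yes"
  done
  echo "coder templates delete agents-firewalled --org coder"
}
```

Reviewing the printed plan before running it keeps the firewalled baseline and the other templates out of the blast radius.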

References

  • Live baseline: firewalled template + austenplatform/firewall-test workspace (pod coder-7771f506-...).
  • docs/architecture/agent-firewall-feasibility.md (WS-22), deploy/observability/dashboards-boundary.yaml (uid agent-firewall).
  • Boundary: standalone boundary v0.9.0, MIT, /usr/local/bin/boundary, landjail/Landlock.
  • Coder Agents architecture: coder.com/docs/ai-coder/agents.

Generated by Coder Agents, on behalf of @ausbru87.
