autobrowse: optional vendor-neutral inbox-provider hook #119
Lets an autobrowse loop provision a throwaway inbox so the inner agent can register accounts and complete email verification. A new scripts/inbox.mjs CLI (create / wait-otp / wait-link / latest / release) talks to the browse.sh inbox endpoint, which owns the AgentMail key — the agent only ever sees the address. evaluate.mjs gains --inbox-email, injects the inbox into the system prompt, and allows the agent to shell out to inbox.mjs. SKILL.md documents the opt-in provision/release steps, graduation note (inbox is loop-only), and the 3-concurrent-loop free-tier cap. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Consolidates all inbox-provisioning logic into the autobrowse skill so the feature is self-contained with nothing browse.sh-specific. inbox.mjs now calls api.agentmail.to directly using AGENTMAIL_API_KEY from the env (sweep-on-create and the ab- prefix guard move into the CLI). Browserbase deployments inject a pooled key; regular users provide their own (free at agentmail.to) and get a clear setup error if it's unset. The inner agent still only ever sees the inbox address — the key is read by inbox.mjs and never printed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Hardening found by a live Substack magic-link signup run end-to-end:
- wait-link returned an open-tracking pixel (.gif) because it grabbed the first URL anywhere in the body. Now extract <a href> anchors with a reject-list (unsubscribe/mailto/tel/preferences/.gif), which skips img-src pixels; --match matches the href OR the visible link text so "confirm"/"sign in" finds the CTA even when the href is a tracking redirect (browse open follows it).
- latest only showed list-summary metadata (the list endpoint omits the body). It now fetches the full single message by id so text/html/links are visible.
- partsOf prefers AgentMail's cleaned extracted_text/extracted_html.
- evaluate.mjs killed wait-otp/wait-link at the fixed 30s exec cap (ETIMEDOUT on --within 60/90). exec timeout for inbox wait commands is now --within + 15s.
Verified end-to-end: signup → wait-link returns the real "Confirm your email" CTA → browse open → signed-in Substack home. Sweep still proven to never touch non-ab- inboxes.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
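For readers following the wait-link fix, here is a minimal sketch of anchor-based extraction with a reject-list and href-or-text matching. The names `extractLinks`, `pickLink`, and `REJECT` are hypothetical, and the regex-based HTML handling is an assumption made to keep the example short; it is not the code from this PR.

```javascript
// Hypothetical sketch: only <a href> anchors are considered, so <img src>
// tracking pixels are never candidates in the first place.
const REJECT = /unsubscribe|^mailto:|^tel:|preferences|\.gif(\?|$)/i;

function extractLinks(html) {
  const links = [];
  const anchorRe = /<a\b[^>]*\bhref\s*=\s*["']([^"']+)["'][^>]*>([\s\S]*?)<\/a>/gi;
  let m;
  while ((m = anchorRe.exec(html)) !== null) {
    const href = m[1].trim();
    // Strip nested tags so the visible link text can be matched.
    const text = m[2].replace(/<[^>]+>/g, " ").replace(/\s+/g, " ").trim();
    if (REJECT.test(href)) continue; // skip unsubscribe/mailto/tel/preferences/.gif
    links.push({ href, text });
  }
  return links;
}

function pickLink(html, match) {
  const links = extractLinks(html);
  if (!match) return links[0] ?? null;
  const needle = match.toLowerCase();
  // Match the href OR the visible text, so a tracking-redirect href still
  // resolves when the CTA text says "Confirm your email".
  return (
    links.find(
      (l) =>
        l.href.toLowerCase().includes(needle) ||
        l.text.toLowerCase().includes(needle)
    ) ?? null
  );
}
```

A real provider would likely want a proper HTML parser; the regex here is only enough to illustrate the selection rule.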
… truth)
- create now releases the inbox the task already tracks before minting a new one; a re-create within the 1h sweep window otherwise orphaned a live inbox (leaked AND unreachable by release). (#2)
- evaluate.mjs resolves the inbox address from .inbox.json (what wait-otp/wait-link actually poll); --inbox-email is a fallback and a mismatch now warns instead of silently polling a different inbox. (#4)
- {{inbox_email}} in task.md is now substituted with the resolved address. (#3)
- executeCommand pins inbox.mjs to the run's own --workspace/--task, so a sub-agent can't read or release a sibling task's inbox (parallel runs share a workspace, isolated only by --task). (#5)
The 30s exec-timeout issue (#1) was already fixed by execTimeoutFor in 2d091fc.
Verified: re-create deletes the prior inbox (no orphan); a divergent --inbox-email warns and the resolved address wins; {{inbox_email}} is replaced; an agent passing a foreign --task is overridden back to its own.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
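The address resolution and placeholder substitution described in this commit could look roughly like the sketch below. `resolveInboxEmail` and `substituteInbox` are hypothetical names, and `state` stands for the already-parsed contents of .inbox.json (or null when the file is absent); the PR's actual helpers may differ.

```javascript
// Hypothetical sketch: .inbox.json is the source of truth, --inbox-email is
// only a fallback, and a mismatch warns instead of silently winning.
function resolveInboxEmail(state, flagEmail, warn = console.warn) {
  const tracked = state?.email ?? null;
  if (tracked && flagEmail && tracked !== flagEmail) {
    warn(
      `--inbox-email ${flagEmail} differs from .inbox.json (${tracked}); using ${tracked}`
    );
  }
  return tracked ?? flagEmail ?? null;
}

function substituteInbox(taskText, email) {
  if (!email) return taskText; // no inbox configured: leave task.md untouched
  // A function replacement avoids "$"-pattern expansion in the address.
  return taskText.replaceAll("{{inbox_email}}", () => email);
}
```

For example, `substituteInbox("Sign up with {{inbox_email}}.", "ab-1@example.test")` yields `"Sign up with ab-1@example.test."`.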
Addressed the Bugbot findings in
Verified each end-to-end against a live AgentMail org: re-create deletes the prior inbox (no orphan); a divergent |
Validation summary: ready for review
Tested at HEAD
No leaked inboxes after any run. Ready to merge. |
Reviewed this. The core idea is solid and the security spine is genuinely well done: the AgentMail key never reaches the inner agent, only the throwaway address does. Nice. Three things worth tightening before merge. Framing them simply:
1. Leftover inboxes pile up (the big one).
2. It can grab the wrong verification code.
3. It'll open links sent by strangers.
Everything else I found is low/nit (a possibly-dead
Removes AgentMail from the public skill entirely and replaces the bundled
inbox.mjs with a generic, off-by-default provider contract. autobrowse no longer
ships an email provider or names any vendor; it only knows how to *call* one.
- evaluate.mjs: `--inbox-cmd <path>` / AUTOBROWSE_INBOX_CMD configures an optional
inbox-provider command. Allowlist, exec-timeout, force-scope, and the (now
vendor-neutral) Agent Inbox prompt key off it; all are inert when unset.
Documents the provider contract (create/wait-otp/wait-link/latest/release +
the .inbox.json {email,inbox_id} schema) as the explicit boundary.
- Deleted scripts/inbox.mjs (AgentMail-specific — moves to the internal caller).
- Scrubbed AGENTMAIL_API_KEY/agentmail.to from .env.example, SKILL.md (silent on
the feature), and example-task.md.
Kept generic mechanics: .inbox.json single-source-of-truth, {{inbox_email}}
substitution, --workspace/--task force-scoping, wait-command exec timeout.
Verified: with a throwaway stub provider the hook injects the section,
substitutes the address, and forces scope; with no --inbox-cmd there is no inbox
section and the allowlist is browse-only. `git grep -i agentmail` → no matches.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
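To illustrate how the wait-command exec timeout can key off the configured provider and stay inert when it is unset, here is a hedged sketch. `waitTimeoutMs` is a hypothetical stand-in, not the PR's `execTimeoutFor`; it assumes the command is already tokenized into `argv` and that `--within` is given in seconds.

```javascript
const DEFAULT_EXEC_TIMEOUT_MS = 30_000; // the fixed exec cap mentioned in the thread
const WAIT_GRACE_MS = 15_000; // headroom so the provider can time out on its own first

// Hypothetical sketch: only wait-otp/wait-link on the configured provider get
// an extended timeout; everything else keeps the default cap.
function waitTimeoutMs(argv, inboxCmd) {
  if (!inboxCmd || argv[0] !== inboxCmd) return DEFAULT_EXEC_TIMEOUT_MS;
  if (!["wait-otp", "wait-link"].includes(argv[1])) return DEFAULT_EXEC_TIMEOUT_MS;
  const i = argv.indexOf("--within");
  const within = i >= 0 ? Number(argv[i + 1]) : NaN; // assumed to be seconds
  if (!Number.isFinite(within) || within <= 0) return DEFAULT_EXEC_TIMEOUT_MS;
  return within * 1000 + WAIT_GRACE_MS;
}
```

With no provider configured the function always returns the default, which mirrors the "inert when unset" behavior described above.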
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 0593523.
Reworked per team feedback: AgentMail is now fully out of this public repo
Keeping AgentMail browse.sh-internal (skills is public), but without forking autobrowse. The inbox capability is now a generic, off-by-default provider hook; the AgentMail implementation + secrets live only in the internal browse.sh repo and are injected into the sandbox at runtime. This PR (public) now:
Verified: with a throwaway stub provider the hook injects the section + substitutes the address + forces scope; with no Pairs with the internal browse.sh PR (provider injection). Divergence stays minimal: one shared autobrowse core; browse.sh owns only a swappable provider script + a few prompt lines. |
✅ Re-validated end-to-end on the reworked architecture (full browse.sh sandbox pipeline, local): |
Reviewed this alongside the AgentMail provider in browserbase/browse.sh#159. The vendor-neutral split is great: this PR ships zero email-vendor code, just a clean
🟡 Medium
The allowlist lets the inner agent run more than it should.
⚪ Low
✅ What's done well
The contract design is clean, and the isolation instinct …
(Review was AI-assisted.)
Review fixes (skills#119):
- Inner allowlist now permits only the read subcommands (wait-otp / wait-link /
latest) of the configured --inbox-cmd provider; create/release are
orchestrator-only. Stops the inner agent from killing its own inbox
(`release`) or rewriting .inbox.json (`create`) mid-run so the polled inbox
no longer matches the address in its prompt.
- readInboxState prefers the contract's `email` field (`email || inbox_id`);
it was using inbox_id as the address, which only worked while a provider set
them equal.
- buildInboxSection OTP text is vendor-neutral now ("the provider's default
code match; pass --regex to override") — the digit default lives in the
provider, not this file.
Verified (stub provider, no network): release is BLOCKED while wait-otp is
allowed and scope-forced to the run's task; an email≠inbox_id .inbox.json
resolves to email; {{inbox_email}} substituted.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
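The three review fixes can be pictured with the sketch below. All names (`isAllowedInboxCall`, `forceScope`, `emailFromState`) are hypothetical stand-ins for the PR's allowlist check, `forceInboxScope`, and `readInboxState`; the bodies are assumptions about how such helpers might work on a tokenized `argv`, not the code in evaluate.mjs.

```javascript
// Read-only subcommands the inner agent may call; create/release stay
// orchestrator-only.
const READ_SUBCOMMANDS = new Set(["wait-otp", "wait-link", "latest"]);

function isAllowedInboxCall(argv, inboxCmd) {
  return Boolean(inboxCmd) && argv[0] === inboxCmd && READ_SUBCOMMANDS.has(argv[1]);
}

// Drop any agent-supplied --workspace/--task and append the run's own values,
// so a foreign --task is overridden rather than trusted.
function forceScope(argv, workspace, task) {
  const scoped = [];
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === "--workspace" || arg === "--task") {
      i++; // also skip the flag's value
      continue;
    }
    if (arg.startsWith("--workspace=") || arg.startsWith("--task=")) continue;
    scoped.push(arg);
  }
  return [...scoped, "--workspace", workspace, "--task", task];
}

// Prefer the contract's `email` field; fall back to inbox_id only when a
// provider did not set email.
function emailFromState(state) {
  return state.email || state.inbox_id;
}
```

Appending the forced flags after stripping the agent's own keeps the result unambiguous regardless of how a provider parses repeated flags.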
Thanks @shubh24 — addressed in 🟡 Allowlist let the agent run more than it should. Fixed — the inner allowlist now permits only the read subcommands ( ⚪ ⚪ OTP default text vendor-specific. ⚪ PR description didn't match diff. Rewritten above to describe the actual hook + merge order. ⚪ Link-safety from your earlier comment (open-redirect/internal-IP, substring The Cursor inline threads (orphan-inbox, placeholder-substitution, prompt-vs-state, args-unsandboxed) are stale — all resolved in the rework; the one current Cursor HIGH ( |

Summary
Adds a generic, vendor-neutral, off-by-default inbox-provider hook to autobrowse so it can build signup/login/MFA skills — without shipping any email-vendor code (AgentMail lives only in the internal browse.sh repo). This is the public half.
- `skills/autobrowse/scripts/evaluate.mjs`: `--inbox-cmd <path>` / `AUTOBROWSE_INBOX_CMD` configures an optional external "inbox provider" command. When unset there is no inbox feature (default). The file documents the provider contract (`create` / `wait-otp` / `wait-link` / `latest` / `release` + the `.inbox.json` `{ email, inbox_id }` schema).
- The inner allowlist permits only the read subcommands (`wait-otp` / `wait-link` / `latest`); `create` / `release` are orchestrator-only.
- `forceInboxScope` pins the provider to the run's own `--workspace` / `--task` (sibling-task isolation in parallel runs).
- The inbox address is resolved from `.inbox.json` (`email` preferred), with `--inbox-email` as a fallback; `{{inbox_email}}` in `task.md` is substituted.
- `git grep -i agentmail` is empty.

No `inbox.mjs` and no SKILL.md changes: the feature is intentionally undocumented/experimental for now (the flag is in `--help` + the contract comment).

Test plan
- With a stub provider the hook injects the section, substitutes the address, and forces scope to the run's `--workspace` / `--task`; with no `--inbox-cmd` there's no section and the allowlist is browse-only.
- `release` / `create` BLOCKED for the inner agent; `wait-otp` / `wait-link` / `latest` allowed.
- `readInboxState` resolves to `email` when `email` ≠ `inbox_id`.

🤖 Generated with Claude Code