diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 0000000..3258672
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,42 @@
+# Contributing
+
+This repository is an incubation space for the
+[Tool Annotations Interest Group](https://modelcontextprotocol.io/community/tool-annotations/charter).
+We welcome proposals, schema changes, and reference implementations that
+inform a future Extensions Track SEP.
+
+## What lives here
+
+- **Specification drafts** — `specification/draft/.mdx`,
+  one file per extension, written in the same RFC-2119 style as the core MCP
+  specification (per [SEP-2133](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md)).
+- **Decision records** — `docs/decisions.md`. Append, do not rewrite.
+- **Open questions** — `docs/open-questions.md`.
+
+## What does *not* live here
+
+- Implementation code. Reference implementations live in their own
+  repositories and are linked from the relevant `specification/draft/*.mdx`.
+- Binding specification changes. Those are made through the
+  [SEP process](https://modelcontextprotocol.io/community/sep-guidelines).
+
+## Proposing a change to an existing extension
+
+1. Open a PR against `specification/draft/.mdx`.
+2. Update the **Status** and **Changelog** sections in the frontmatter.
+3. If the change is breaking (per the SEP-2133 definition), use a new
+   extension identifier and a new file.
+4. Append an entry to `docs/decisions.md` if the change reflects a design
+   decision worth preserving.
+
+## Proposing a new extension
+
+1. Read [SEP-2133, "Experimental Extensions"](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md#experimental-extensions).
+2. Open a discussion or PR proposing the identifier and scope.
+3. On acceptance, add `specification/draft/.mdx` using the
+   frontmatter from an existing draft as a template.
+
+## Code of conduct
+
+This repository follows the
+[MCP Code of Conduct](https://github.com/modelcontextprotocol/.github/blob/main/CODE_OF_CONDUCT.md).
diff --git a/LICENSE b/LICENSE
new file mode 100644
index 0000000..b024e54
--- /dev/null
+++ b/LICENSE
@@ -0,0 +1,11 @@
+Apache License
+Version 2.0, January 2004
+http://www.apache.org/licenses/
+
+Per SEP-2133, official MCP extensions are required to be available under the
+Apache 2.0 license. Experimental extensions in this repository follow the
+same convention so that contributions made here can flow into a future
+official extension repository without re-licensing.
+
+The full text of the Apache License, Version 2.0 is available at:
+https://www.apache.org/licenses/LICENSE-2.0
diff --git a/MAINTAINERS.md b/MAINTAINERS.md
new file mode 100644
index 0000000..a72eb4f
--- /dev/null
+++ b/MAINTAINERS.md
@@ -0,0 +1,20 @@
+# Maintainers
+
+This repository is governed by the
+[Tool Annotations Interest Group](https://modelcontextprotocol.io/community/tool-annotations/charter).
+Day-to-day repository maintenance follows the IG's facilitator structure.
+
+| Role | Name | Organization | GitHub |
+| ----------- | -------------- | ------------ | ---------------------------------------------------- |
+| Facilitator | Sam Morrow | GitHub | [@SamMorrowDrums](https://github.com/SamMorrowDrums) |
+| Facilitator | Robert Reichel | OpenAI | [@rreichel3](https://github.com/rreichel3) |
+
+Per [SEP-2133](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md#experimental-extensions),
+core maintainers of the modelcontextprotocol organization retain oversight,
+including the ability to archive or remove this repository.
+
+## Per-extension maintainers
+
+Individual extensions may nominate additional maintainers responsible for
+their specification draft and reference implementations. List them in the
+`specification/draft/.mdx` frontmatter.
diff --git a/README.md b/README.md
index dc87aeb..8c07123 100644
--- a/README.md
+++ b/README.md
@@ -1 +1,96 @@
-# experimental-ext-tool-annotations
+# Tool Annotations Interest Group — Experimental Extensions
+
+> ⚠️ **Experimental** — This repository is an incubation space for the
+> [Tool Annotations Interest Group](https://modelcontextprotocol.io/community/tool-annotations/charter).
+> Contents are exploratory drafts intended to feed future Extensions Track SEPs
+> ([SEP-2133](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2133)).
+> They do not represent official MCP specifications or recommendations.
+
+**Charter:** [modelcontextprotocol.io/community/tool-annotations/charter](https://modelcontextprotocol.io/community/tool-annotations/charter)
+**Discord:** [#tool-annotations-ig](https://discord.com/channels/1358869848138059966/1482836798517543073)
+**Open work:** [Pull requests](https://github.com/modelcontextprotocol/experimental-ext-tool-annotations/pulls)
+
+## Why split the work?
+
+[SEP-1913 (Trust and Sensitivity Annotations)](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913)
+bundles four concerns that have proven hard to evaluate as a single unit: a
+client-facing trust taxonomy, action-security metadata for tool I/O, a
+malicious-activity signal, and propagation rules across session boundaries.
+
+The sponsor, [@localden](https://github.com/localden), asked the central
+question directly in review: the SEP "adds a few schema modifications and a
+thorny array-or-scalar polymorphism on enum fields. If the taxonomy turns out
+to be wrong, I worry that we can't remove it or easily modify it. Can we do a
+potential narrower first cut?" The subsequent design discussion converged on a
+layered answer: a small, stable annotation surface on the wire, with richer
+evidence kept out-of-band and referenced by a bounded pointer.
+
+This repo follows that steer.
Each concern becomes a **separate experimental
+extension** with its own [reverse-DNS identifier](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md#definition),
+its own reference implementation, and its own path to a future Extensions
+Track SEP. Drafts can graduate independently — directly addressing the "narrower
+first cut" ask without throwing away the combinatoric value of the full set.
+
+See [docs/decisions.md](docs/decisions.md) for the decision record and
+[docs/trust-model.md](docs/trust-model.md) for the shared enforcement model.
+
+## Extensions
+
+| Identifier | Status | What it specifies | Reference implementation(s) |
+| :--- | :--- | :--- | :--- |
+| [`io.modelcontextprotocol/trust-annotations`](specification/draft/trust-annotations.mdx) | Draft skeleton | **Primary extension.** A small, scheme-agnostic client-facing data-classification vocabulary (`sensitive`, `untrusted`) on result `_meta`, plus an optional `evidenceRef` pointer slot that carries richer payloads out-of-band. | Python SDK: [`kapil8811/mcp-trust-annotations`](https://github.com/kapil8811/mcp-trust-annotations) (138-test suite, healthcare demo, LLM usability study). |
+| [`io.modelcontextprotocol/action-metadata`](specification/draft/action-metadata.mdx) | Draft skeleton | `inputMetadata` / `returnMetadata` / outcome classifiers (incl. `requires_review`) on `ToolAnnotations`, describing where inputs go, where outputs originate, and what real-world effects a tool can cause. | Carries forward [SEP-2061 (Action Security Metadata)](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061) by [@rreichel3](https://github.com/rreichel3); reference impl per that proposal (`read_drafts` / `list_inbox` / `send_email`).
|
+| [`io.modelcontextprotocol/ifc-fides`](specification/draft/ifc-fides.mdx) | Draft skeleton | A **profile** of the `trust-annotations` `evidenceRef` slot: `type: "ifc.fides.v1"` carrying an integrity + confidentiality label for deterministic information-flow control, following the FIDES paper ([arXiv:2505.23643](https://arxiv.org/abs/2505.23643)). | Emitter candidate: [`github-mcp-server`](https://github.com/github/github-mcp-server) (does not emit IFC labels today — closing that gap is the proof point). |
+
+### Why FIDES is a profile, not a top-level extension
+
+Information-flow control is modelled as a profile rather than the namespace
+root because IFC (an integrity × confidentiality lattice) is one enforcement
+model among several that reviewers raised — capability tokens, caller/tool
+cosigning, and sequence-shape audit records. A top-level `ifc/` root would bake
+one academic model into the namespace and foreclose the others. As one reviewer
+put it, IFC "fits relatively well if you use annotations" — an endorsement of
+IFC *as a profile*, not as the wire root. As a `type` value under
+`trust-annotations`'s open-ended `evidenceRef` slot, the FIDES work stays
+first-class while every other model can occupy the same slot.
+
+## Relationship to SEP-1913
+
+SEP-1913 remains the canonical place to discuss the overall problem framing.
+This repository develops the schema-bearing parts of that proposal as
+independently shippable extensions. When an extension here is ready to graduate,
+an Extensions Track SEP can reference this repo as the prior art and the working
+implementation that SEP-2133 [requires](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md#creation).
+
+For the full per-SEP plan — what happens to SEP-1913, SEP-2061, SEP-1862 and
+others, and the SEP-2127 refactor precedent — see
+[docs/sep-disposition.md](docs/sep-disposition.md).
+
+**Out of scope for these extensions** (see [docs/open-questions.md](docs/open-questions.md)):
+
+- **`maliciousActivityHint`** — reviewer concerns are structural (it fires at
+  `tools/resolve` before execution can produce evidence; a boolean is the wrong
+  granularity for client UX; clients won't trust server self-attestation). If it
+  returns, it is per-`ContentBlock` with spans, on a different clock. It stays
+  on the SEP-1913 umbrella rather than in an extension here.
+- **Propagation rules** — sensitivity escalation across session boundaries, and
+  the sequence-shape gap (an annotation surface for "this was call N in a
+  flagged sequence") remain open. Likely a future extension once the taxonomy
+  and `evidenceRef` shape are stable.
+
+## Repository layout
+
+This repo mirrors the structure of official extension repositories such as
+[`ext-auth`](https://github.com/modelcontextprotocol/ext-auth):
+
+```
+specification/draft/.mdx   # one spec per extension
+docs/                      # decision log, open questions, related work
+MAINTAINERS.md             # IG facilitators
+```
+
+## Contributing
+
+See [CONTRIBUTING.md](CONTRIBUTING.md). Substantive design discussion happens
+on PRs against the relevant `specification/draft/*.mdx` file, in the IG
+Discord, and (for cross-extension concerns) on [SEP-1913](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913).
diff --git a/docs/decisions.md b/docs/decisions.md
new file mode 100644
index 0000000..33ecb8b
--- /dev/null
+++ b/docs/decisions.md
@@ -0,0 +1,91 @@
+# Decision log
+
+Append-only record of design decisions for the Tool Annotations IG's trust /
+privacy extension work. Newest at the bottom.
+
+## 2026-06-10 — Split SEP-1913 into independent extensions
+
+**Decision.** Split the schema-bearing parts of SEP-1913 into separate
+experimental extensions, each with its own `io.modelcontextprotocol/…`
+identifier and reference implementation, rather than pursuing one broad
+Standards Track SEP.
+
+**Rationale.** @localden's review asked for a *narrower first cut*; the IG
+[aligned 2026-05-28](https://github.com/modelcontextprotocol/modelcontextprotocol/discussions/2820)
+on an extension-first strategy. Independent extensions can graduate on their own
+clock and avoid hard-to-remove schema.
+
+## 2026-06-10 — Three initial extensions
+
+**Decision.** `trust-annotations` (primary), `action-metadata`, `ifc-fides`.
+
+**Rationale.** These are the three pieces with either a reference implementation
+or an existing SEP behind them: Kapil's SDK, SEP-2061 (Reichel), and the FIDES
+model respectively.
+
+## 2026-06-10 — FIDES is a profile, not a top-level extension
+
+**Decision.** Information-flow control is `type: "ifc.fides.v1"`, a profile of
+the `trust-annotations` `evidenceRef` slot — not a top-level `io.modelcontextprotocol/ifc`
+extension.
+
+**Rationale.** IFC is one enforcement model among several raised in review
+(capability tokens — pshkv; cosigning — viftode4; sequence shape — marras0914).
+A top-level `ifc/` root would foreclose those. Reviewer endorsement was for IFC
+"if you use annotations" — i.e. as a profile.
+
+## 2026-06-10 — `evidenceRef.type` is an open string
+
+**Decision.** `type` MUST remain an open string with a non-binding registry of
+well-known values; never a closed enum. Required fields are `digest` and
+`canonicalization`; `schema` recommended; `ref` optional.
+
+**Rationale.** Adapted from the vaaraio / Rul1an convergence in the SEP-1913
+thread. An open `type` is what lets IFC, data-class, sequence-shape, and
+attestation profiles share one slot.
+
+## 2026-06-10 — `requiresReview` moves to `action-metadata`
+
+**Decision.** `requiresReview` is an `action-metadata` field, not a
+`trust-annotations` field.
+
+**Rationale.** It is a workflow/consent signal, not a data-classification
+property. Keeping it out of the trust taxonomy avoids reproducing SEP-1913's
+"several concerns in one schema" problem at smaller scale.
+
+## 2026-06-10 — DataClass demoted to a profile
+
+**Decision.** The wire taxonomy keeps only the coarse `sensitive` boolean.
+The four-level classification + regulatory scope becomes an `evidenceRef`
+profile `type: "data-class.v1"`.
+
+**Rationale.** Coarse binary is universally client-actionable and cheap on the
+wire; the richer taxonomy can evolve behind a profile without a breaking schema
+change.
+
+## 2026-06-10 — Parked: maliciousActivityHint, propagation rules
+
+**Decision.** Neither becomes an extension now; both stay on the SEP-1913
+umbrella.
+
+**Rationale.** `maliciousActivityHint` has unresolved structural objections
+(fires pre-execution at `tools/resolve`; boolean granularity wrong for UX;
+clients won't trust server self-attestation). Propagation/sequence-shape needs
+the taxonomy and `evidenceRef` stable first.
+
+## 2026-06-10 — Citations: public sources only
+
+**Decision.** Reference implementations and motivating examples cite **public**
+artifacts — [`github-mcp-server`](https://github.com/github/github-mcp-server),
+[`kapil8811/mcp-trust-annotations`](https://github.com/kapil8811/mcp-trust-annotations),
+[arXiv:2505.23643](https://arxiv.org/abs/2505.23643) — and index on the public
+SEP-1913 review record (esp. @localden). Private/internal implementations are
+not named or linked.
+
+## 2026-06-10 — Pre-flight (SEP-1862) stays core
+
+**Decision.** These extensions are response-level and do not depend on Tool
+Resolution. SEP-1862 remains a core/Standards-Track protocol change.
+
+**Rationale.** The 2026-05-28 IG meeting concluded pre-flight is inherently a
+protocol-level change, not an extension.
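The `evidenceRef` decision recorded in `docs/decisions.md` (open `type`, required `digest` and `canonicalization`, recommended `schema`, optional `ref`) can be sketched as a small consumer-side check. This is a hypothetical illustration, not part of any draft: the function names, the `sha256:<hex>` digest encoding, the `json-sorted-keys` canonicalization label, the treatment of `type` as mandatory, and the example label payload are all assumptions, since the drafts do not yet fix those details.

```python
import hashlib
import json

# Required per the decision log; `schema` and `ref` stay optional here.
REQUIRED_FIELDS = ("digest", "canonicalization")


def check_evidence_ref(ref: dict) -> list:
    """Return a list of problems found in an evidenceRef-shaped dict."""
    problems = []
    # `type` is an open string: any non-empty string passes, with no enum
    # lookup. Requiring its presence is an assumption of this sketch.
    if not isinstance(ref.get("type"), str) or not ref["type"]:
        problems.append("type must be a non-empty string")
    for field in REQUIRED_FIELDS:
        if not isinstance(ref.get(field), str) or not ref[field]:
            problems.append(f"{field} is required")
    return problems


def canonicalize(record: dict) -> bytes:
    # Hypothetical "json-sorted-keys" form: sorted keys, compact separators.
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def digest_matches(ref: dict, record: dict) -> bool:
    """Re-hash the referenced record and compare it with the claimed digest."""
    if ref.get("canonicalization") != "json-sorted-keys":
        return False  # unknown canonicalization: cannot verify, so do not trust
    expected = "sha256:" + hashlib.sha256(canonicalize(record)).hexdigest()
    return ref.get("digest") == expected


# Example payload; the field names and values are illustrative only.
record = {"integrity": "trusted", "confidentiality": "private"}
ref = {
    "type": "ifc.fides.v1",
    "canonicalization": "json-sorted-keys",
    "digest": "sha256:" + hashlib.sha256(canonicalize(record)).hexdigest(),
}
print(check_evidence_ref(ref))      # []
print(digest_matches(ref, record))  # True
```

Because `type` is never checked against a closed list, a consumer that does not recognize a profile can still validate the shape and skip the payload, which is the property the open-string decision is meant to preserve.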
diff --git a/docs/intent-comment.md b/docs/intent-comment.md
new file mode 100644
index 0000000..ecc3bca
--- /dev/null
+++ b/docs/intent-comment.md
@@ -0,0 +1,61 @@
+# Intent comment (draft, pre-post review)
+
+This is the comment we plan to post on
+[SEP-1913](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913),
+with an abbreviated pointer version for
+[SEP-2061](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061).
+Kept here so it can be reviewed and stays in sync with
+[sep-disposition.md](./sep-disposition.md).
+
+---
+
+## For SEP-1913
+
+> **Intent: split this SEP and migrate to the Extensions Track**
+> >
+> A note on direction for everyone following this thread. When SEP-1913 was
+> first framed, the **Extensions Track** (SEP-2133) and the `experimental-ext-*`
+> incubation process didn't exist in their current form. They now do, and
+> they're a better fit for this work than a single Standards Track SEP.
+> >
+> Two things pushed us here:
+> >
+> - @localden's review ask for a **narrower first cut** — the concern that a
+>   broad taxonomy with array-or-scalar polymorphism is hard to remove or change
+>   once it lands.
+> - The Tool Annotations IG's
+>   [May 28 decision](https://github.com/modelcontextprotocol/modelcontextprotocol/discussions/2820)
+>   to pursue trust/privacy as an **experimental extension first**, gather
+>   adoption evidence, then ask core maintainers to absorb anything.
+> >
+> So the plan is to \*\*split this proposal into a few small,
+> independently-shippable extensions\*\*, each with its own
+> `io.modelcontextprotocol/…` identifier, reference implementation, and path to
+> an Extensions Track SEP. Incubation is in
+> [`experimental-ext-tool-annotations`](https://github.com/modelcontextprotocol/experimental-ext-tool-annotations).
+> >
+> Kicking off with a few draft extensions in the tool annotations repo — not
+> sure yet whether they'd each need separate repos eventually, or whether
+> grouping them in one is fine. That's part of what incubation is for.
+> >
+> | Extension | Scope |
+> |---|---|
+> | `trust-annotations` | The narrow data-classification taxonomy (`sensitive`, `untrusted`) + an open-ended `evidenceRef` pointer for richer, out-of-band evidence. |
+> | `action-metadata` | Tool I/O + outcome contract (folds in @rreichel3's SEP-2061). |
+> | `ifc-fides` | Information-flow control ([arXiv:2505.23643](https://arxiv.org/abs/2505.23643)) as **one profile** of `evidenceRef`, not a wire root — the public/private-repo confidentiality case, with github-mcp-server as an emitter example. |
+> >
+> Deliberately removed: `maliciousActivityHint` (the structural concerns raised
+> here are unresolved) and session-level propagation rules.
+> >
+> This follows the same Standards-Track → Extensions-Track refactor pattern as
+> SEP-2127 (#2893). This PR will eventually pivot to the `trust-annotations`
+> piece itself, with the other schema-bearing pieces moving out into their own
+> extensions. Everything is still in the incubation phase, so naming, design,
+> and the choice of what to put forward as an extension are all open for
+> discussion in the IG.
+
+---
+
+## For SEP-2061 (coordination note)
+
+> @rreichel3 — splitting this out into an independent extension as discussed: https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913#issuecomment-4675047154
\ No newline at end of file
diff --git a/docs/open-questions.md b/docs/open-questions.md
new file mode 100644
index 0000000..57deb26
--- /dev/null
+++ b/docs/open-questions.md
@@ -0,0 +1,45 @@
+# Open questions
+
+Tracked here rather than in the spec drafts, so the drafts stay non-temporal.
+
+## Cross-cutting
+
+- **Where does the policy-enforcement engine live** across different user
+  universes (cross-org, cross-domain)?
Engines work well within one universe;
+  cross-domain is the hard case. (IG 2026-05-28.)
+- **Cross-domain integrity verification** — is asymmetric crypto for domain
+  identity in scope for a future extension, or out of scope entirely? CLI tools
+  remain a persistent gap for enforcing these constraints.
+- **`evidenceRef.type` registry** — who curates the list of well-known profile
+  types, and how do we coordinate with attestation SEPs (e.g. SEP-2787) so
+  values don't collide?
+
+## trust-annotations
+
+- Is `sensitive` the right single coarse signal, or do we need the
+  `data-class.v1` profile from day one?
+- Content-block-level vs. result-level attachment — does the draft need a
+  worked multi-result example before it's implementable?
+- `list_changed`: confirmed response-level annotations don't participate; revisit
+  only if trust vocabulary ever attaches to tool definitions.
+
+## action-metadata
+
+- Coexistence vs. replacement of legacy `destructiveHint` / `readOnlyHint` /
+  `idempotentHint` / `openWorldHint`.
+- Open strings vs. closed enums for `destination` / `source` / `sensitivity`.
+- Does `requiresReview` need a machine-readable *reason* for good client UX?
+
+## ifc-fides
+
+- Inline `_meta.ifc` for low-friction adoption vs. always behind `evidenceRef`.
+- GitHub Enterprise `internal` repo visibility → public/private/reader-set
+  mapping (audience is the whole org, broader than collaborators).
+
+## Parked (SEP-1913 umbrella)
+
+- **`maliciousActivityHint`** — if it returns, it is per-`ContentBlock` with
+  spans, driven by the host's own detection, not a server-attested boolean.
+- **Session-level propagation rules** — escalation semantics and the
+  sequence-shape gap ("this was call N in a flagged sequence" has no response
+  annotation surface today).
diff --git a/docs/related-work.md b/docs/related-work.md
new file mode 100644
index 0000000..f5717fe
--- /dev/null
+++ b/docs/related-work.md
@@ -0,0 +1,38 @@
+# Related work
+
+External references and prior art relevant to the IG's trust / privacy
+annotation work. Several were surfaced in IG meetings (notably 2026-05-28).
+
+## SEPs
+
+- [SEP-1913 — Trust and Sensitivity Annotations](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913) — the umbrella proposal these extensions derive from.
+- [SEP-2061 — Action Security Metadata](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061) — carried forward as `action-metadata`.
+- [SEP-1862 — Tool Resolution / pre-flight checks](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1862) — core-protocol, composes with these extensions.
+- [SEP-2133 — Extensions](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md) — the framework this repo incubates under.
+- [SEP-2127 — Server Cards](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2893) — precedent for the Standards→Extensions Track refactor.
+- [SEP-2787 — Tool Call Attestation](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2787) — candidate `evidenceRef` profile.
+
+## Research
+
+- **FIDES** — *Information-flow control for LLM agents.* [arXiv:2505.23643](https://arxiv.org/abs/2505.23643). Basis for the `ifc.fides.v1` profile.
+- **Design Patterns for Securing LLM Agents** — IBM/Google/Microsoft. [arXiv:2506.08837](https://arxiv.org/abs/2506.08837). Plan-Then-Execute, Dual LLM, Map-Reduce, etc.
+- **Trail of Bits** — prompt-injection via hidden content in GitHub issues. [blog](https://blog.trailofbits.com/2025/08/06/prompt-injection-engineering-for-attackers-exploiting-github-copilot/).
+- **OpenAI Auto Review** — https://alignment.openai.com/auto-review/ (shared in IG chat).
+
+## Implementations & tooling
+
+- [`kapil8811/mcp-trust-annotations`](https://github.com/kapil8811/mcp-trust-annotations) — reference Python SDK PoC for `trust-annotations`.
+- [`github-mcp-server`](https://github.com/github/github-mcp-server) — public MCP server; emitter candidate for `ifc-fides` (knows repo visibility + collaborators).
+- **Ethyca** data-labeling docs — https://www.ethyca.com/docs (shared in IG chat).
+- **GitHub Next** agentic-workflows research on data labeling — to be documented as issues in this repo (IG action item, @gokhanarkan / @joannakl).
+
+## Adjacent community proposals (from the SEP-1913 thread)
+
+- **SINT Protocol** (capability-token constraint enforcement) — pshkv.
+- **in-toto** attestations as a trust-annotation substrate.
+- **OVERT 1.0** envelope shape for runtime evidence.
+- Caller/tool **cosigning** model — viftode4.
+- **Sequence-shape** policies — marras0914.
+
+These are exactly the models that `evidenceRef`'s open `type` is designed to
+accommodate as profiles.
diff --git a/docs/sep-disposition.md b/docs/sep-disposition.md
new file mode 100644
index 0000000..ab22421
--- /dev/null
+++ b/docs/sep-disposition.md
@@ -0,0 +1,110 @@
+# SEP disposition: what happens to the existing proposals
+
+This document explains how the existing trust/privacy/annotation SEPs map onto
+the experimental extensions incubated in this repository, and what is proposed
+to happen to each SEP. It exists so that anyone arriving from one of those PRs
+can understand the plan without reading the whole thread.
+
+> **Status:** proposal / options. Nothing here is decided until reflected in the
+> relevant SEP PRs. The migration follows the precedent set by
+> [SEP-2127 → Extensions Track (#2893)](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2893).
+
+## Why anything changes
+
+When [SEP-1913](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913)
+was first framed, the **Extensions Track** ([SEP-2133](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md))
+and the `experimental-ext-*` incubation process did not exist in their current
+form. Two inputs since then point at a different shape:
+
+1. **Sponsor steer.** [@localden](https://github.com/localden) asked for a
+   *narrower first cut* — a single broad taxonomy with array-or-scalar enum
+   polymorphism is hard to remove or change once shipped.
+2. **IG decision.** The Tool Annotations IG
+   [aligned on 2026-05-28](https://github.com/modelcontextprotocol/modelcontextprotocol/discussions/2820)
+   to pursue this work as an **experimental extension first**, build an
+   adoption/evidence base, and only then ask core maintainers to absorb
+   anything into the protocol.
+
+The result: split the schema-bearing parts into small, independent extensions,
+each able to graduate on its own clock.
+
+## The precedent: SEP-2127 (Server Cards)
+
+[#2893](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2893)
+refactored SEP-2127 from **Standards Track** to **Extensions Track**:
+
+- Frontmatter `Type: Standards Track` → `Type: Extensions Track`, plus an
+  `Extension Identifier` and "on behalf of the WG" attribution.
+- A top-of-file `` pointing at the experimental repo as the spec home.
+- The SEP body **slimmed to a charter** — Abstract, Motivation, Rationale, a
+  high-level Specification *pointer*, security posture *summary* — with the
+  detailed normative wire format delegated to the extension repo.
+- The SEP body kept **non-temporal**: a published SEP is frozen, so in-flight
+  "open items" live in the PR description and extension-repo issues, not in the
+  SEP text.
+
+We apply the same playbook below.
+
+## Per-SEP disposition
+
+### SEP-1913 — Trust and Sensitivity Annotations
+
+**Proposed:** becomes the **umbrella / problem-framing** thread. The
+schema-bearing content moves into the extensions below. Options, in order of
+preference:
+
+- **(A, preferred)** Keep the PR open as the framing umbrella; add an intent
+  comment (see [below](#intent-comment)); later, either refactor it to an
+  Extensions Track *charter* that points here (SEP-2127 shape) **or** close it
+  in favor of per-extension Extensions Track SEPs once those are ready.
+- **(B)** Refactor 1913 itself into the `trust-annotations` Extensions Track
+  SEP and spin the others off as siblings.
+- **(C)** Close 1913 outright and open three fresh Extensions Track SEPs. Loses
+  the discussion history's continuity; not preferred.
+
+**Moved into extensions:** `trust-annotations`, `action-metadata`, `ifc-fides`.
+**Parked on the umbrella:** `maliciousActivityHint`,
+session-level propagation rules. See [open-questions.md](./open-questions.md).
+
+### SEP-2061 — Action Security Metadata
+
+**Proposed:** becomes the [`action-metadata`](../specification/draft/action-metadata.mdx)
+extension. SEP-2061 is by [@rreichel3](https://github.com/rreichel3), who is
+also an IG co-facilitator and SEP-1913 co-author, so this is a fold-in, not a
+collision. Disposition mirrors 1913 option (A): keep the thread as the
+field-semantics discussion, add a pointer comment linking it to the
+`action-metadata` extension, refactor to Extensions Track when ready.
+
+### SEP-1862 — Tool Resolution (pre-flight checks)
+
+**Proposed:** **stays Standards Track / core.** The 2026-05-28 IG meeting
+concluded pre-flight checks are inherently a protocol-level change, not an
+extension. These extensions are deliberately **response-level** (`_meta` on
+results, static `ToolAnnotations`) and do **not** depend on 1862. They compose
+with it if it lands, but do not block on it.
+
+### Other related SEPs (not owned here)
+
+- **SEP-1984 (Comprehensive Tool Annotations)**, **SEP-2417 (Model Preferences
+  for Tools)** — tracked by the IG as discussion items; not part of these
+  extensions. Cross-link only.
+- **SEP-2787 (Tool Call Attestation)** and the various attestation/evidence
+  threads — these are natural `evidenceRef` *profile* candidates rather than
+  competitors. Coordinate so the `evidenceRef.type` registry can list them.
+
+## Mapping table
+
+| SEP | Title | Proposed disposition | Extension home |
+| :-- | :-- | :-- | :-- |
+| 1913 | Trust & Sensitivity Annotations | Umbrella thread; schema moves to extensions | `trust-annotations` (+ `ifc-fides`) |
+| 2061 | Action Security Metadata | Fold into extension | `action-metadata` |
+| 1862 | Tool Resolution (pre-flight) | Stays core / Standards Track | — (composes, no dependency) |
+| 1984 | Comprehensive Tool Annotations | IG discussion item | — |
+| 2417 | Model Preferences for Tools | IG discussion item | — |
+| 2787 | Tool Call Attestation | Candidate `evidenceRef` profile | (future) |
+
+## Intent comment
+
+The text we plan to post on SEP-1913 (and, abbreviated, on SEP-2061) lives in
+[intent-comment.md](./intent-comment.md) so it can be reviewed before posting
+and kept in sync with this document.
diff --git a/docs/trust-model.md b/docs/trust-model.md
new file mode 100644
index 0000000..98774e4
--- /dev/null
+++ b/docs/trust-model.md
@@ -0,0 +1,53 @@
+# Trust model
+
+A single statement of the enforcement model shared by all extensions in this
+repository, so individual specs don't re-litigate it.
+
+## Annotations are claims, not guarantees
+
+An annotation on a tool result or definition is a **claim** made by whoever
+produced it. Nothing in these extensions assumes the producer is honest or
+competent. The value of an annotation comes from two things:
+
+1.
**Verifiability.** Where a claim needs to be trusted, the `evidenceRef`
+   pointer (see [`trust-annotations`](../specification/draft/trust-annotations.mdx))
+   lets a consumer resolve and check the evidence behind it — re-hash the
+   referenced record, verify a signature, check an inclusion proof — rather
+   than taking the claim on faith.
+2. **Accountability.** Enforcement lives with the parties that can impose
+   consequences: **registries and marketplaces** that admit servers,
+   **hosts/clients** that gate actions, and **operators** that set policy. The
+   annotation gives those parties a machine-readable surface to act on.
+
+This mirrors the framing repeated throughout the SEP-1913 discussion: trust
+comes from the ecosystem verifying annotations, not from developer good faith.
+A server that lies in its annotations is a server a registry can refuse to list
+and a host can refuse to trust — the same accountability model as any other
+declared capability.
+
+## Defense in depth, not a single gate
+
+As framed in the IG's inaugural meeting, these annotations are **defense in
+depth**: they reduce the likelihood of unintended actions (unnecessary
+destructive operations, data crossing a boundary it shouldn't); they don't
+claim to be a complete security boundary. A host SHOULD combine them with its
+own checks (its own injection detection, its own policy engine) rather than
+treating any single annotation as authoritative.
+
+## Human-in-the-loop on the risky edges
+
+A recurring pattern from the IG research (Joanna's "phases" work, and the
+SEP-1913 thread): rather than blanket-blocking flows a policy engine is unsure
+about, **flag the specific call for user confirmation**. This preserves utility
+while keeping a human on the genuinely risky edges, and is the recommended
+default for `requiresReview` ([`action-metadata`](../specification/draft/action-metadata.mdx))
+and for IFC policy violations ([`ifc-fides`](../specification/draft/ifc-fides.mdx)).
+ +## Cross-domain is the hard case + +Policy engines work well inside a single user/organization universe and +struggle across universes (cross-org, cross-domain flows). These extensions +provide the *signal* (where data came from, how sensitive it is, what a tool +does with it); they do not solve cross-domain enforcement on their own. That +remains an [open question](./open-questions.md) and a likely area for future +work (e.g. asymmetric crypto for domain integrity verification). diff --git a/specification/draft/action-metadata.mdx b/specification/draft/action-metadata.mdx new file mode 100644 index 0000000..5ea8d8c --- /dev/null +++ b/specification/draft/action-metadata.mdx @@ -0,0 +1,132 @@ +--- +title: Action Metadata +--- + +**Protocol Revision**: draft + +**Extension identifier:** `io.modelcontextprotocol/action-metadata` + +> ⚠️ **Experimental draft skeleton.** This carries forward +> [SEP-2061: Action Security Metadata](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061) +> by [@rreichel3](https://github.com/rreichel3) into the IG's experimental repo, +> per the May 28 2026 decision to pursue trust/privacy work as an extension +> first. SEP-2061 remains the canonical discussion thread for the field +> semantics. + +## Abstract + +This extension adds a small, declarative contract to a tool's static +`ToolAnnotations` describing **what the tool does with data**: where inputs may +go, where outputs originate, and what real-world outcome invoking it can cause. +Where [`trust-annotations`](./trust-annotations.mdx) classifies *data in +transit*, this extension classifies *tool behavior*. The two are complementary +and independently adoptable — a client can consume action metadata without +implementing trust annotations at all. + +## Motivation + +MCP today treats all tool calls as equivalent at the protocol level beyond the +coarse `readOnlyHint` / `destructiveHint` / `idempotentHint` / `openWorldHint` +hints. 
A tool that reads drafts and a tool that sends email are otherwise +indistinguishable, even though their privacy and consent implications differ +radically. Runtimes fall back to inferring risk from tool names or model +behavior, which does not scale. + +This was reinforced in the May 28 2026 IG meeting: a model often **cannot tell +whether a target is private or public**, and absent that signal it may push +content somewhere it should not. A declarative behavioral contract lets clients +and models make safer decisions without baking domain knowledge into every +model. + +The canonical worked example from SEP-2061: `read_drafts`, `list_inbox`, and +`send_email` can share an identical JSON Schema yet have completely different +security semantics — only action metadata distinguishes them. + +## Specification + +### Dependencies + +This extension annotates the existing `ToolAnnotations` object returned by +`tools/list`. It has no dependency on `trust-annotations` or on Tool Resolution. + +### Fields + +Carried under the extension-namespaced key on `ToolAnnotations`: + +```jsonc +{ + "annotations": { + "io.modelcontextprotocol/action-metadata": { + "inputMetadata": { + "destination": "external", // where input data may be stored/sent + "sensitivity": "personal" // kind of data the tool accepts + }, + "returnMetadata": { + "source": "open-world", // where returned data originates + "sensitivity": "public" + }, + "outcome": "consequential", // benign | consequential | irreversible + "requiresReview": true // host SHOULD seek human confirmation + } + } +} +``` + +| Field | Meaning | +| :--- | :--- | +| `inputMetadata.destination` | Where data passed to the tool may end up (e.g. `local`, `internal`, `external`). | +| `inputMetadata.sensitivity` | The kind of data the tool is designed to accept. | +| `returnMetadata.source` | Where the tool's returned data originates (e.g. `first-party`, `open-world`). 
| +| `returnMetadata.sensitivity` | The kind of data the tool is designed to return. | +| `outcome` | Real-world effect class: `benign` / `consequential` / `irreversible`. | +| `requiresReview` | The tool author signals that a host SHOULD obtain explicit human confirmation before invocation. | + +> Exact enum value sets are inherited from SEP-2061 and are **not** re-litigated +> here; this draft tracks that proposal. Where SEP-2061 evolves, this file +> follows. + +### `requiresReview` lives here, deliberately + +`requiresReview` is a **workflow/consent** signal, not a data-classification +property. It was intentionally moved out of [`trust-annotations`](./trust-annotations.mdx) +(which stays strictly data-classifying) to avoid reproducing SEP-1913's +"several concerns in one schema" problem at smaller scale. It sits next to +`outcome` because both describe the *act* of calling the tool rather than the +*data* in flight. + +### Lifecycle and `list_changed` + +These fields are part of the **tool definition** (`ToolAnnotations`). They are +therefore covered by `tools/list_changed`: a server that changes a tool's +action metadata MUST emit `list_changed` as it would for any tool-definition +change. (This is the opposite of `trust-annotations`, which is response-level.) + +## Relationship to existing annotations + +`outcome: irreversible` overlaps conceptually with `destructiveHint` but is +strictly richer (a three-way classification vs. a boolean) and is scoped to the +real-world effect rather than to whether the operation is destructive to +server-side state. The IG will need to decide whether action metadata +*supersedes* or *coexists with* the legacy hints before any graduation. + +## Reference implementation + +Per [SEP-2061](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061): +the `read_drafts` / `list_inbox` / `send_email` worked example with identical +schemas and divergent action metadata. 
A public MCP server emitting these +fields (candidate: [`github-mcp-server`](https://github.com/github/github-mcp-server)) +would anchor the draft in a real ecosystem. + +## Open questions + +- Coexistence vs. replacement of `destructiveHint` / `readOnlyHint`. +- Whether `destination` / `source` / `sensitivity` enums should be open strings + (consistent with `evidenceRef.type`) or closed enums. +- Whether `requiresReview` needs a machine-readable *reason* (vs. a bare + boolean) to drive good client UX. + +## Changelog + +| Date | Change | +| ---------- | ------------------------------------------------------------------- | +| 2026-06-10 | Initial draft skeleton, carrying SEP-2061 into the experimental repo; absorbed `requiresReview` from the trust taxonomy. | diff --git a/specification/draft/ifc-fides.mdx b/specification/draft/ifc-fides.mdx new file mode 100644 index 0000000..328ed17 --- /dev/null +++ b/specification/draft/ifc-fides.mdx @@ -0,0 +1,127 @@ +--- +title: Information-Flow Control (FIDES profile) +--- + +**Protocol Revision**: draft + +**Extension identifier:** `io.modelcontextprotocol/ifc-fides`  ·  **Profile of:** `io.modelcontextprotocol/trust-annotations` + +> ⚠️ **Experimental draft skeleton.** This defines a *profile* of the +> [`trust-annotations`](./trust-annotations.mdx) `evidenceRef` slot. It is **not** +> a standalone wire root — see [Why a profile](#why-a-profile). + +## Abstract + +This extension defines `ifc.fides.v1`, a profile of the `trust-annotations` +`evidenceRef` slot that carries an **information-flow-control label** — +integrity plus confidentiality — following the FIDES model +([arXiv:2505.23643](https://arxiv.org/abs/2505.23643)). A host that implements +deterministic information-flow control can consume these labels to decide +whether a tool call is permitted, without baking the IFC model into the core +protocol or into the `trust-annotations` wire surface. 
+ +## Why a profile + +Information-flow control is one enforcement model among several that reviewers +of SEP-1913 raised — capability tokens, caller/tool cosigning, and +sequence-shape audit records were all put forward. A top-level extension +(`io.modelcontextprotocol/ifc`) would make the FIDES integrity × confidentiality +lattice the namespace root and silently foreclose those other models. + +As a `type` value under the open-ended `evidenceRef` slot, the FIDES label is +first-class while the slot stays free for every other model. One reviewer's +framing captured it: IFC "fits relatively well *if you use annotations*" — an +endorsement of IFC as a profile, not as the wire root. + +## Motivation + +The motivating case is the one raised in the +[2026-05-28 IG meeting](https://github.com/modelcontextprotocol/modelcontextprotocol/discussions/2820): +**a model often cannot tell whether a repository is public or private**, and +lacking that signal it may push private content to a public destination. An IFC +label lets the host track confidentiality (who may read this data) and +integrity (is this data trusted) as context accumulates across tool calls, and +deny or prompt before a flow violates policy. + +A public MCP server is the natural emitter. [`github-mcp-server`](https://github.com/github/github-mcp-server) +returns repository data whose confidentiality is determined by repo visibility +and collaborator sets — exactly the public/private signal above — but does +**not** emit IFC labels today. Closing that emitter gap is the concrete proof +point for this profile: a host-side consumer of the label shape already exists, +so the missing half is a server willing to emit it. + +## Specification + +### Profile identity + +This profile is selected by `evidenceRef.type == "ifc.fides.v1"` on a +`trust-annotations` annotation. 
A client that does not implement IFC MUST be +able to ignore it safely (the surrounding `sensitive` / `untrusted` booleans and +the `digest`/`canonicalization` pair remain meaningful). + +### Label payload + +The record referenced by the `evidenceRef` (and, for low-friction adoption, MAY +be inlined by deployments that accept the wire cost) has the shape: + +```jsonc +{ + "integrity": "trusted", // "trusted" | "untrusted" (FIDES §4.1 two-level lattice) + "confidentiality": "public" // "public" | "private" | ["login1","login2", …] +} +``` + +| Field | Meaning | +| :--- | :--- | +| `integrity` | Two-level integrity lattice (`trusted` ⊑ `untrusted`): trusted data may flow to untrusted sinks, not vice versa. | +| `confidentiality` | `"public"` = world-readable; `"private"` = an opaque marker the host resolves to a reader set (e.g. repo collaborators); an explicit `string[]` = pre-resolved reader logins. Fewer readers = more confidential = higher in the lattice. | + +### Label semantics + +- **Join on accumulation.** As a session ingests labeled results, the context + label is the *join* of what it has seen: integrity degrades toward + `untrusted`, confidentiality narrows toward the smallest reader set. +- **Policy check before egress.** Before a write/egress tool call, the host + checks whether the current context label may flow to the call's target. When + a label is absent, the host falls back to its default (trusted-action) + policy rather than assuming the worst — labels are an *additive* signal. +- **Confidentiality resolution.** `"private"` is intentionally opaque on the + wire; resolving it to a concrete reader set (e.g. via a collaborators lookup) + is a host concern, so servers need not enumerate audiences inline. + +> The normative integrity/confidentiality lattice definitions follow the FIDES +> paper, §4.1 and §4.3. This draft references the model rather than restating +> the proofs. 
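
The join and egress check described under "Label semantics" can be sketched as below. This is an illustration only, not the normative FIDES definition. It assumes the host has already resolved `"private"` to a concrete reader set, models `"public"` as `None`, and uses hypothetical names (`Label`, `join`, `may_flow`, `sink_requires_trusted`) that do not appear in the draft.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

# None stands for "public" (world-readable); a frozenset is a resolved reader set.
Readers = Optional[FrozenSet[str]]


@dataclass(frozen=True)
class Label:
    trusted: bool = True     # integrity: True = "trusted", False = "untrusted"
    readers: Readers = None  # confidentiality, after host-side resolution


def join(a: Label, b: Label) -> Label:
    """Combine two labels: integrity degrades, the reader set narrows."""
    trusted = a.trusted and b.trusted
    if a.readers is None:
        readers = b.readers
    elif b.readers is None:
        readers = a.readers
    else:
        readers = a.readers & b.readers  # only readers allowed by both remain
    return Label(trusted, readers)


def may_flow(context: Label, sink_readers: Readers,
             sink_requires_trusted: bool = False) -> bool:
    """Return True if data labeled `context` may be written to the sink."""
    if sink_requires_trusted and not context.trusted:
        return False
    if context.readers is None:
        return True   # public data may go anywhere
    if sink_readers is None:
        return False  # restricted data must not reach a public sink
    return sink_readers <= context.readers  # sink audience must already be allowed
```

A host would start a session with `Label()`, fold each labeled tool result in with `join`, and call `may_flow` before an egress call. When a result carries no label, the draft's fallback to the host's default policy applies and this check is skipped.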
+ +### Relationship to `trust-annotations` + +`ifc-fides` never appears without a host `trust-annotations` annotation +carrying the `evidenceRef`. The booleans are the universally-actionable signal; +the IFC label is the precise, host-checkable evidence behind them. + +## Reference implementation + +- **Consumer:** a host-side IFC engine that parses the `{integrity, + confidentiality}` label, maintains a context label across tool results, and + applies a flow policy before egress operations already exists in practice. + (Linked once a public reference is available.) +- **Emitter (gap / proof point):** [`github-mcp-server`](https://github.com/github/github-mcp-server) + is the candidate — it already knows repo visibility and collaborator sets, + which are exactly the confidentiality inputs. + +## Open questions + +- Should the label be inlinable on `_meta.ifc` directly for low-friction + adoption, or always behind `evidenceRef` for schema minimalism? (Lean: + permit both; `evidenceRef` is canonical, inline is a convenience.) +- How does GitHub Enterprise `internal` repo visibility map onto the + public/private/reader-set confidentiality model? (Audience is the whole org, + strictly broader than collaborators — likely falls back to default policy.) +- Registry coordination with other attestation/evidence profiles + (e.g. SEP-2787) so `evidenceRef.type` values don't collide. + +## Changelog + +| Date | Change | +| ---------- | ------------------------------------------------------------ | +| 2026-06-10 | Initial draft skeleton. Reframed from a top-level `ifc` extension to a `trust-annotations` `evidenceRef` profile (`ifc.fides.v1`). 
| diff --git a/specification/draft/trust-annotations.mdx b/specification/draft/trust-annotations.mdx new file mode 100644 index 0000000..0fc51d4 --- /dev/null +++ b/specification/draft/trust-annotations.mdx @@ -0,0 +1,176 @@ +--- +title: Trust Annotations +--- + +**Protocol Revision**: draft + +**Extension identifier:** `io.modelcontextprotocol/trust-annotations` + +> ⚠️ **Experimental draft skeleton.** This document captures the agreed shape +> and the open questions. Normative text is intentionally thin pending +> reference-implementation validation. Substantive discussion happens on PRs +> against this file and on [SEP-1913](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913). + +## Abstract + +This extension defines a small, stable, scheme-agnostic vocabulary for +classifying **data in transit** through MCP tool results, plus an optional +`evidenceRef` pointer that lets a deployment attach richer, out-of-band +evidence without growing the on-wire schema. It is the primary +data-classification extension in the Tool Annotations IG's trust work; other +extensions (information-flow control, action metadata) compose with it rather +than duplicating it. + +The design follows the [@localden](https://github.com/localden) review steer on +SEP-1913 — take a *narrow first cut* of the taxonomy and avoid hard-to-remove +schema — while preserving the layered "small annotation on the wire, rich +evidence out-of-band" consensus that emerged in the SEP-1913 thread. + +## Motivation + +Data crosses tool boundaries today with no standardized markers for whether it +is sensitive or whether it originated from an untrusted source. Clients and +hosts are left to infer this from tool names or model behavior. Two coarse, +broadly-applicable signals cover the majority of client-actionable cases: + +- **`sensitive`** — the content should be treated as confidential (PII, + credentials, proprietary data). Drives consent prompts and egress policy. 
+- **`untrusted`** — the content originated from an open-world / attacker- + influenceable source (web pages, third-party email, user-generated content). + Drives prompt-injection defenses. + +Anything richer than these two booleans is deliberately **not** on the wire; it +hangs off `evidenceRef` (see below). + +## Specification + +### Dependencies + +This extension depends only on the base MCP `_meta` mechanism. It does not +require Tool Resolution ([SEP-1862](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1862)), +though it composes with it. + +### Annotation shape + +Trust annotations are carried under the extension-namespaced `_meta` key on a +`CallToolResult` (and MAY appear on an individual `ContentBlock` — see +[Attachment point](#attachment-point)). + +```jsonc +{ + "_meta": { + "io.modelcontextprotocol/trust-annotations": { + "sensitive": true, // optional boolean + "untrusted": true, // optional boolean + "evidenceRef": { // optional pointer, see below + "type": "data-class.v1", + "digest": "sha256:…", + "canonicalization": "cbor/rfc8949", + "schema": "https://…/data-class.v1.json", + "ref": "audit://…" // optional locator + } + } + } +} +``` + +Both booleans are optional and carry no default value. Absence MUST be +treated as "no claim made," never as "asserted false." + +### The `evidenceRef` slot + +`evidenceRef` is the extension point that keeps the wire schema small while +letting deployments attach arbitrarily rich evidence. Its shape (adapted from +the SEP-1913 discussion): + +| Field | Required | Meaning | +| :--- | :--- | :--- | +| `type` | yes | **Open string** naming the class of the referenced record. NOT an enum. Examples: `"data-class.v1"`, `"ifc.fides.v1"`, `"sequence"`, `"policy-decision"`. | +| `digest` | yes | Hash of the referenced record, so a client holding the data can re-derive it. | +| `canonicalization` | yes | How the digest was computed (e.g. `"cbor/rfc8949"`), so the client can re-hash independently.
| +| `schema` | recommended | Identifier/version of the record `ref` resolves to. | +| `ref` | optional | Locator into the deployment's audit/evidence stream. | + +> **Normative intent:** `type` MUST remain an open string. Narrowing it to a +> closed enum would foreclose the capability-token, cosigning, and +> sequence-shape models raised in SEP-1913 review. A non-binding **registry** +> of well-known `type` values is maintained in this repo; unknown `type` values +> MUST be safely ignorable by a client that does not understand them (the +> `digest`/`canonicalization` pair is still a usable, bounded signal). + +This single slot subsumes the previously separate `attestationChainRef` / +`policyDecisionRef` ideas — both become `type` values. + +### Coarse vs. rich classification (DataClass) + +SEP-1913 carried a four-level data classification +(`public` / `personal` / `confidential` / `highly_confidential`) plus a +regulatory scope (e.g. `confidential:hipaa`). **This extension deliberately +keeps only the coarse `sensitive` boolean on the wire.** Richer classification +is recovered as an `evidenceRef` profile: + +```jsonc +"evidenceRef": { + "type": "data-class.v1", + // record resolves to e.g. { "class": "highly_confidential", "regulatory": ["hipaa"] } + "digest": "sha256:…", + "canonicalization": "cbor/rfc8949" +} +``` + +This is an explicit scope decision: the binary lives on the wire for universal +client actionability; the taxonomy lives behind a profile so it can evolve +without a breaking schema change. + +### Attachment point + +Annotations attach at the **`CallToolResult` level by default.** A server MAY +additionally annotate an individual `ContentBlock` when it has reason to +localize the signal (e.g. one search result among many is untrusted). When both +are present, the content-block annotation refines the result-level one for that +block; it MUST NOT *weaken* a result-level claim (union semantics — once +`true`, stays `true`). 
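
One way a client might compute the effective annotation for a single content block under the union rule above is sketched here. The helper names `effective_flag` and `effective_annotation` are hypothetical, and treating an explicit `false` as distinct from an absent field is an assumption of this sketch, in keeping with the "no claim made" reading of absence.

```python
from typing import Optional

KEY = "io.modelcontextprotocol/trust-annotations"


def effective_flag(result_level: Optional[bool],
                   block_level: Optional[bool]) -> Optional[bool]:
    """Merge one boolean: a True at either level wins; None means no claim."""
    if result_level is True or block_level is True:
        return True  # a block can never weaken a result-level True
    if block_level is not None:
        return block_level
    return result_level


def effective_annotation(result_meta: dict, block_meta: dict) -> dict:
    """Combine result-level and block-level `_meta` into per-block flags."""
    result_ann = result_meta.get(KEY, {})
    block_ann = block_meta.get(KEY, {})
    merged = {}
    for name in ("sensitive", "untrusted"):
        value = effective_flag(result_ann.get(name), block_ann.get(name))
        if value is not None:  # omit the field entirely when nobody made a claim
            merged[name] = value
    return merged
```

For example, a result marked `sensitive: true` whose block says `sensitive: false, untrusted: true` still yields both flags as `true` for that block, while a block with no annotation simply inherits the result-level claims.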
+ +### Lifecycle and `list_changed` + +Trust annotations defined here are **response-level**: they describe a specific +tool result and are not part of the tool *definition*. They therefore do **not** +participate in `tools/list_changed`. (If a future revision attaches trust +vocabulary to tool definitions, that surface would follow `list_changed`; this +draft does not.) + +### Propagation + +This extension does **not** specify session-level escalation/propagation rules. +Those remain an [open question](../../docs/open-questions.md) on the SEP-1913 +umbrella. A host MAY implement propagation locally; this extension only +standardizes the per-result annotation. + +## Reference implementation + +[`kapil8811/mcp-trust-annotations`](https://github.com/kapil8811/mcp-trust-annotations) +— a Python SDK with a `@trust_annotated` decorator, `to_wire`/`from_wire` +round-tripping, a policy engine (audit/warn/enforce), a healthcare-scenario +demo, and an LLM-based usability study (138 tests). The SDK predates this +narrowed shape and is being aligned to the two-boolean + `evidenceRef` model. + +## Trust model + +Enforcement does not rest on developer honesty. See +[docs/trust-model.md](../../docs/trust-model.md): registries and marketplaces +verifying annotations are the enforcement layer; the annotation is a *claim*, +and `evidenceRef` is how that claim is made checkable. + +## Open questions + +- Is `sensitive` the right single coarse signal, or do we need `sensitive` + + the `data-class.v1` profile from day one? +- Exact required-vs-recommended split on `evidenceRef` fields. +- Whether content-block-level annotation needs a worked multi-result example + before the draft is implementable. + +## Changelog + +| Date | Change | +| ---------- | ------------------------------------------------------------- | +| 2026-06-10 | Initial draft skeleton. Narrowed to `sensitive` + `untrusted` + `evidenceRef`; DataClass demoted to a profile; `requires_review` moved to `action-metadata`. |