Adds higher level guiding principles for the extension framework #8317

Draft
therealjohn wants to merge 2 commits into Azure:main from therealjohn:extension-principles

@therealjohn
Contributor

While building the agents extension, we faced many decisions about the right shape for extensions and kept running into questions like "should this even be an extension?" or "should we separate this into multiple extensions?". As more partner teams integrate, we need clear guidance so that contributions are neither gated by our team nor measured against constantly shifting goalposts.


This pull request adds a comprehensive new guide, extension-principles.md, outlining the core design and behavior principles for building azd extensions. The document helps developers determine whether a feature should be an extension, what kind it should be, and how it should interact with the broader azd ecosystem. It provides a checklist, identity and behavior principles, a decision rubric, and references to related guides.

Key additions and themes:

Extension Identity and Categorization:

  • Defines three extension categories—workload, platform, and resource—and provides criteria for each, including a "journey-continuity test" to help decide if a feature belongs in an extension or elsewhere.
  • Emphasizes having one canonical extension per workload or resource type, discouraging duplication across extensions.

Behavioral Best Practices:

  • Establishes principles for extension behavior, such as using the azd lifecycle for declared workloads, supporting both producer and consumer workflows, and ensuring commands can operate without project context when appropriate.
  • Requires shared primitives (like endpoint resolution or credential factories) to live in the platform extension to avoid duplication.
  • Details dependency management, requiring that dependencies flow downward through the platform layer, not directly between sibling extensions.
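
The dependency rule in the last bullet could in principle be checked mechanically. The following is a minimal sketch in Python, not something the PR or azd provides. It assumes a hypothetical in-memory manifest where each extension declares its category and dependencies. The `Extension` type, the `dependency_violations` function, and the `contoso.*` ids are all illustrative names, and the rule is read as "an extension may depend only on platform extensions".

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Extension:
    """Hypothetical manifest entry; not an azd type."""
    id: str
    category: str  # assumed values: "platform", "workload", "resource"
    depends_on: Tuple[str, ...] = ()


def dependency_violations(extensions: List[Extension]) -> List[Tuple[str, str]]:
    """Return (extension id, dependency id) pairs that break the rule.

    Assumed reading of the rule: every dependency must point at a
    platform extension. Unknown dependency ids are reported too,
    since their category cannot be verified.
    """
    by_id: Dict[str, Extension] = {ext.id: ext for ext in extensions}
    violations: List[Tuple[str, str]] = []
    for ext in extensions:
        for dep_id in ext.depends_on:
            dep = by_id.get(dep_id)
            if dep is None or dep.category != "platform":
                violations.append((ext.id, dep_id))
    return violations


# Illustrative data only: a resource extension depending on a sibling workload.
example = [
    Extension("contoso.platform", "platform"),
    Extension("contoso.agents", "workload", ("contoso.platform",)),
    Extension("contoso.search", "resource", ("contoso.agents",)),
]
print(dependency_violations(example))  # [('contoso.search', 'contoso.agents')]
```

Under this reading, the fix for the flagged pair would be to move whatever `contoso.search` needs from `contoso.agents` into the platform extension.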

Collaborator

@kristenwomack kristenwomack left a comment

Hi John,

Thank you for putting this together, and thank you for being such a dedicated partner in shaping this framework. The agents extension surfaced many of the questions this doc answers, and the principles are sharper because you stayed with them long enough to see the patterns. The journey-continuity test in P1 is the question I want partner teams asking before they file a new extension, and the decision rubric is the thing they'll paste into their design docs.

I'd like to work through the changes below before I approve. A few of these touch product calls I should help carry, so let's pair on those wherever it's useful.

Requested changes:

  1. Swap "family" for "namespace" throughout. "Family" isn't defined in the doc and reads fuzzy. "Namespace" matches azure.ai.* and lines up with how partner teams already talk about it. Where you mean the group of extensions sharing a namespace, "extension group" works too.

  2. Name the audiences up top. Two sentences in the intro covering who this is for (partner teams contributing extensions, our own team, and producers vs. consumers as end users) would anchor the rest. Producer and consumer don't show up until P5, and pulling them forward helps. Happy to draft those two sentences if it's useful.

  3. Add a governance hook for P4 tie-breaks and P12 conventions. The intro promises contributions that aren't gated by our team, and P4 and P12 both need someone to arbitrate when partner teams disagree. This one is a product call as much as a doc call, so let's work it together. I can take a first pass on a short section covering who owns the conventions doc per namespace, how a partner team proposes a deviation, and where it gets recorded (likely an ADR under docs/architecture/). Would you rather fold it in here, or land this PR and tackle governance as a follow-up?

  4. Set a minimum telemetry bar for promotion. P8 is qualitative, and P10 says official-registry extensions are "telemetry-instrumented per P8." There's a gap between them. Let's work the concrete bar together based on what you learned shipping agents (probably required funnel events for init, provision, deploy, and invoke where applicable, plus structured ServiceError or LocalError on all failure paths). For this PR a placeholder pointer from P10 is plenty.

  5. Make the resource-extension graduation path an expectation, not a side note. P2 mentions resource extensions "can later graduate to a declarative form." Declarative is the direction we want partner teams pointed at, so worth stating as the expected path rather than a maybe. Otherwise teams optimize for the imperative surface and we inherit the lock-in.

  6. Add a short "not an extension" list at the top of the rubric. Three bullets would give partner teams the fastest off-ramp: one-to-one Azure API wrappers go to Azure CLI, universal helpers go to core, and standalone tools ship standalone. Right now they have to read the whole rubric to get there.

  7. Say where approved deviations get recorded. The doc repeats "deviate when you have a real reason," which is the right stance. One sentence on where the deviation lives (the extension's README for local ones, an ADR for namespace-wide ones) closes the loop. Happy to set up the ADR template side if we go that route.
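
To make item 4 easier to discuss, here is a rough sketch of what a promotion check might look like once the bar is agreed. It is written in Python purely for illustration and assumes the tentative bar floated above: funnel events for init, provision, deploy, and invoke where applicable, plus `ServiceError` or `LocalError` on failure paths. The function name, its inputs, and the stage strings are hypothetical and do not reflect any existing azd telemetry API.

```python
from typing import Dict, Iterable, List

# Tentative stages from item 4; the final list is still an open product call.
REQUIRED_FUNNEL_STAGES = ("init", "provision", "deploy", "invoke")
STRUCTURED_ERROR_KINDS = {"ServiceError", "LocalError"}


def telemetry_gaps(
    emitted_stages: Iterable[str],
    failure_error_kinds: Iterable[str],
    not_applicable: Iterable[str] = (),
) -> Dict[str, List[str]]:
    """Report what an extension is missing relative to the assumed bar.

    emitted_stages: funnel stages the extension emits events for.
    failure_error_kinds: error type names observed on its failure paths.
    not_applicable: stages the extension legitimately does not have.
    """
    skipped = set(not_applicable)
    emitted = set(emitted_stages)
    missing = [
        stage
        for stage in REQUIRED_FUNNEL_STAGES
        if stage not in skipped and stage not in emitted
    ]
    unstructured = sorted(set(failure_error_kinds) - STRUCTURED_ERROR_KINDS)
    return {"missing_stages": missing, "unstructured_errors": unstructured}


# Illustrative input: no invoke step, provision event missing, one raw error type.
print(telemetry_gaps({"init", "deploy"}, ["ServiceError", "RuntimeError"], ("invoke",)))
# {'missing_stages': ['provision'], 'unstructured_errors': ['RuntimeError']}
```

An empty result on both keys would be the pass condition under these assumptions; the placeholder pointer from P10 could reference wherever the agreed list ends up.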

Items 3, 4, and 7 are ones I'd like to partner on with you, since you've already lived the answers. The others are doc tweaks whenever you have a minute.

Let me know what makes sense to fold in here versus pick up next.

Thanks, Kristen

Labels

area/extensions Extensions (general)

2 participants