feat(routines): add azd ai routine commands by huimiu · Pull Request #8241 · Azure/azure-dev

huimiu · 2026-05-19T07:10:26Z

Summary

Implements azd ai routine <verb> in the azure.ai.routines extension. Scoped to the subset of the Foundry Routines TypeSpec (azure-rest-api-specs PR #43186) that the deployed Foundry data plane supports end-to-end today.

A routine pairs one trigger (when) with one action (what) on a Foundry project — e.g. "at 8 AM UTC on Jan 2, invoke the daily-report agent".

Closes #8159

Included

Command	Notes
`routine create <name>`	`--trigger timer` or `--file` manifest; `--force` upsert
`routine update <name>`	GET-then-PUT merge; immutable trigger/action type-switch guard
`routine show` / `list` / `delete`	Standard CRUD; clean 404 mapping
`routine enable` / `disable`	Idempotent
`routine dispatch <name>`	`--input`, `--async`; surfaces `dispatchId` + `actionCorrelationId` + `taskId`
`routine run list <name>`	Auto-paged; `--top`, `--filter`

Trigger: timer (--at, --time-zone).
Action: agent-response (default; --agent-id or --agent-endpoint-id) or agent-invoke (--agent-endpoint-id).

Deferred (re-add when the Foundry service is ready)

Deferred	Why	What to flip when ready
`--trigger recurring` / `--cron`	Service 500s on every `schedule` create	Re-add flags + `buildTrigger` case
`--trigger github-issue`	Workspace connection model not yet wired	Add flags + `buildTrigger` case
`POST :enable` / `:disable`	Service 404s	Switch client from GET+PUT
`POST :dispatch_async` (snake)	Service 404s; only `:dispatchAsync` works	Flip action segment in `DispatchRoutineAsync`
`agent_name` field rename	Service still expects `agentId`	Rename `AgentID` JSON tag
`github_issue_opened` value	Service rejects (400)	Update `TriggerCLIToWire`
`AgentsPagedResult<T>` envelope	Service returns `value` / `nextLink`	Update paged models + pagination loop

Each divergence is also documented inline in code and in cli/azd/extensions/azure.ai.routines/AGENTS.md.

Tests

go build ./... and go test ./... pass. Manual E2E against a live Foundry project covers all included commands (timer create, JSON/YAML manifest, duplicate guard, --force upsert, update, enable/disable, dispatch sync + async + taskId, run list pagination + --top, delete + 404 verification, and all client-side validators).

Add the full v1 routine command subtree to the azure.ai.routines extension as specified in the design spec (PR #8200). Commands implemented: - routine create, update, show, list, delete - routine enable, disable (dedicated idempotent action routes) - routine dispatch (calls dispatch_async, --async flag for client-side wait) - routine run list (auto-paging, --top, --filter) New packages: - internal/exterrors/ -- structured error codes and helpers - internal/pkg/routines/ -- data-plane HTTP client and models - internal/cmd/endpoint.go -- 5-level project endpoint resolver Wire format: trigger/action as Record with 'default' key. All calls include x-ms-foundry-features-opt-in: Routines=V1Preview header. Also adds the design spec at cli/azd/docs/design/ai-routine-design-spec.md.

…dels - Add readAzdProjectSourcesFunc seam to endpoint.go for daemon isolation - endpoint_test.go: isFoundryHost, validateProjectEndpoint, full cascade tests - routine_create_test.go: buildTrigger and buildAction table tests - routine_manifest_test.go: readRoutineManifest (JSON/YAML), mergeRoutineFromFile, applyUpdateFlags, getTrigger/getAction - models_test.go: TriggerCLIToWire and ActionCLIToWire completeness - Add yaml struct tags to models.go for YAML manifest support

- Extract stubAzdProjectSources() helper (mirrors stubAzdHostedSources in agents) - isolateFromAzdDaemon now also clears AZD_SERVER env var - Add t.Parallel() to all pure-function tests (isFoundryHost, validateProjectEndpoint, buildTrigger, buildAction, mergeRoutineFromFile, applyUpdateFlags, getTrigger/getAction, TriggerCLIToWire/ActionCLIToWire map checks)

- cspell.yaml: add exterrors, sess, routineName, azdProjectSources to word list - endpoint.go: remove unused projectEndpointPathPrefix constant - routine_create.go: wrap long buildAction() call (line >125 chars) - routine_update.go: wrap long --file flag help text - routine_manifest_test.go: expand inline map literals to multi-line - client.go: wrap ListRoutineRuns signature to fit 125-char limit - Run gofmt -w and go fix on all files (codes.go, client.go, models.go formatting)

golangci-lint flagged ptrBool as unused. The function had no call sites; the //go:fix inline directive does not exempt it from the unused linter.

The design spec is tracked separately in PR #8200; this PR focuses on the implementation only.

…pagination - Extract getPage helper so resp.Body.Close runs per iteration in ListRoutines and ListRoutineRuns (defer-in-loop leaked FDs). - Preserve the filter query param across pages in ListRoutineRuns; previously page 2+ only carried pageToken and dropped the filter. - Correct dispatch command help text: the service always runs routines asynchronously, so the old 'waits and streams' wording was wrong.

…ine' The extension namespace 'ai.routine' already mounts the extension under 'azd ai routine'. Adding a 'routine' subcommand group on top of that produced the redundant 'azd ai routine routine <cmd>' path. Move --project-endpoint persistent flag and all subcommands directly onto rootCmd so the correct usage is 'azd ai routine <cmd>'.

github-actions · 2026-05-19T09:49:09Z

📋 Prioritization Note

Thanks for the contribution! The linked issue isn't in the current milestone yet.
Review may take a bit longer — reach out to @rajeshkamal5050 or @kristenwomack if you'd like to discuss prioritization.

Copilot

Pull request overview

Adds the initial azd ai routine command surface for the new azure.ai.routines extension, including a Foundry Routines data-plane client, models, endpoint resolution, and CLI commands for routine CRUD/dispatch/run history.

Changes:

Introduces internal/pkg/routines (models + HTTP client) and internal/exterrors structured error helpers.
Implements routine commands (create/update/show/list/delete/enable/disable/dispatch, plus run list) with table/JSON output.
Adds a 5-level project endpoint resolver with unit tests, plus additional unit tests for manifest parsing and flag→wire mapping.

Reviewed changes

Copilot reviewed 24 out of 25 changed files in this pull request and generated 6 comments.

Show a summary per file

File	Description
cli/azd/extensions/azure.ai.routines/internal/pkg/routines/models.go	Routine/trigger/action/run models + CLI→wire mappings/constants.
cli/azd/extensions/azure.ai.routines/internal/pkg/routines/models_test.go	Unit tests for CLI→wire mappings and default keys.
cli/azd/extensions/azure.ai.routines/internal/pkg/routines/client.go	Data-plane HTTP client for routines operations (CRUD, enable/disable, dispatch, run listing).
cli/azd/extensions/azure.ai.routines/internal/exterrors/errors.go	Structured LocalError/ServiceError helpers and wrappers.
cli/azd/extensions/azure.ai.routines/internal/exterrors/codes.go	Error/operation code constants used across commands.
cli/azd/extensions/azure.ai.routines/internal/cmd/root.go	Extension root command wiring (registers routine subcommands + persistent endpoint flag).
cli/azd/extensions/azure.ai.routines/internal/cmd/endpoint.go	Project endpoint validation + 5-level endpoint resolution cascade (flag/env/config/envvar/error).
cli/azd/extensions/azure.ai.routines/internal/cmd/endpoint_test.go	Unit tests for endpoint validation and cascade precedence.
cli/azd/extensions/azure.ai.routines/internal/cmd/routine.go	(Currently unused) routine subcommand group definition.
cli/azd/extensions/azure.ai.routines/internal/cmd/routine_create.go	`routine create` implementation (flags/manifest support, trigger/action building).
cli/azd/extensions/azure.ai.routines/internal/cmd/routine_create_test.go	Unit tests for create flag→wire trigger/action builders.
cli/azd/extensions/azure.ai.routines/internal/cmd/routine_update.go	`routine update` implementation (GET-then-PUT merge semantics, type-switch guard).
cli/azd/extensions/azure.ai.routines/internal/cmd/routine_show.go	`routine show` implementation (table/JSON).
cli/azd/extensions/azure.ai.routines/internal/cmd/routine_list.go	`routine list` implementation (auto-pagination, table/JSON).
cli/azd/extensions/azure.ai.routines/internal/cmd/routine_delete.go	`routine delete` implementation (confirmation prompt, no-prompt/force guard).
cli/azd/extensions/azure.ai.routines/internal/cmd/routine_enable.go	`routine enable` implementation.
cli/azd/extensions/azure.ai.routines/internal/cmd/routine_disable.go	`routine disable` implementation.
cli/azd/extensions/azure.ai.routines/internal/cmd/routine_dispatch.go	`routine dispatch` implementation (dispatch_async call + output handling).
cli/azd/extensions/azure.ai.routines/internal/cmd/routine_run.go	`routine run list` implementation (run history listing with top/filter).
cli/azd/extensions/azure.ai.routines/internal/cmd/routine_helpers.go	Shared helpers (client creation, JSON printing, tabwriter formatting).
cli/azd/extensions/azure.ai.routines/internal/cmd/routine_manifest.go	Manifest parsing and create/update merge helpers.
cli/azd/extensions/azure.ai.routines/internal/cmd/routine_manifest_test.go	Unit tests for manifest parsing/merge/update-flag application helpers.
cli/azd/extensions/azure.ai.routines/go.mod	Extension module dependencies updated.
cli/azd/extensions/azure.ai.routines/go.sum	Dependency checksum updates.
cli/azd/extensions/azure.ai.routines/cspell.yaml	Adds extension-local cspell words.

The first cut of the Routines client used header/route/field shapes that did not match the TypeSpec being merged in azure-rest-api-specs#42779. Realign the extension with the spec so requests round-trip cleanly: * Preview header renamed from 'x-ms-foundry-features-opt-in' to 'Foundry-Features' (the value 'Routines=V1Preview' was already correct). * Async dispatch route renamed ':dispatch_async' -> ':dispatchAsync'; the action segment is case-sensitive per spec. * Dropped the non-existent ':enable' / ':disable' action routes; enable/disable now read the routine and PUT it back with 'enabled' flipped (idempotent: no-op if already at the target value). * DispatchRoutineRequest wraps a discriminated 'payload' object whose 'type' must match the routine's action type; --conversation-id was removed from dispatch (the spec does not expose it). * Routine.Action is now a single discriminated object (not an 'actions' map keyed by name). * RoutineAction.AgentName -> AgentID; the CLI flag is renamed to --agent-id accordingly. * RoutineTrigger.Cron -> CronExpression to match the TypeSpec field. * PagedRoutine pagination follows the absolute 'nextLink' URL from Azure.Core.Page<Routine> instead of re-deriving a continuation query. * RoutineRun gains the additional fields documented in the spec (phase, trigger_type, attempt_source, action_type, triggered_at, dispatch_id, action_correlation_id, response_id, error_type, error_message); 'run list' now prints phase alongside status. * EventRoutineTrigger fields aligned to the spec: connection_id, owner, repository, actions[]; removed 'assignee'. * DispatchRoutineResponse drops the unused 'status' field. Tests, mock manifests, and the E2E driver were updated to the new contract (--agent-id, agent_id, cron_expression, single 'action'). Note: the live Foundry Routines preview endpoint still returns HTTP 500 on /routines?api-version=v1 even with the correct request shape; that is an upstream service bug tracked separately.

wbreza

Code Review (automated, supplements Copilot pre-review)

TL;DR

First v1 of the azd ai routine command surface in a new extension. Implements create/update/show/list/delete/enable/disable/dispatch plus run list, a 5-level endpoint resolver, structured exterrors, and a Foundry data-plane HTTP client. Code is well-organized and the design is sound. However: the HTTP client has zero tests, enable/disable are GET-then-PUT with no concurrency guard, and the PR description contradicts the actual implementation in two places (enable route, preview header name). Findings below are net-new — they do not duplicate Copilot's 6 pre-review comments (port stripping, --enabled masking manifest, github-issue trigger accepted, boolStr nil→"true", applyUpdateFlags using fmt.Errorf, unused newRoutineCommand), all of which look valid.

🔴 Critical

C1 — PR description contradicts the implementation in two places (internal/pkg/routines/client.go)

PR description claims	Code actually does
"`routine enable/disable` — Calls dedicated `:enable` route (idempotent)"	`setEnabled` does `GetRoutine` → mutate `Enabled` → `PutRoutine`. The code comment itself states: "The Foundry Routines API does not expose a dedicated :enable route; the client mutates the routine resource directly."
"Every request includes `x-ms-foundry-features-opt-in: Routines=V1Preview`"	Header constant is `routinesPreviewHeader = "Foundry-Features"`, so the actual header sent is `Foundry-Features: Routines=V1Preview`.

Two consequences:

Reviewers, docs writers, and future maintainers will form an incorrect mental model from the description.
If the Foundry service actually expects the x-ms-foundry-features-opt-in header name, the preview opt-in is silently broken and the server is ignoring it. Please verify the server-side header name against the Foundry TypeSpec and fix either the code or the description.

🟠 High

H1 — setEnabled has a lost-update race and no If-Match guard (client.go, EnableRoutine/DisableRoutine/setEnabled)

Because enable/disable is GET-then-PUT against the full routine resource, any concurrent writer (another CLI invocation, another user, a workspace tool) that mutates the routine between the GET and the PUT will be silently clobbered. There is no If-Match/ETag conditional request to detect the race.

Fix options: (a) capture ETag from the GET response and send If-Match on the PUT, returning a structured "conflict" error on 412; (b) document this race in user-facing help text and emit a warning when the GET observes Enabled != desired; (c) work with the Foundry team to add a true :enable/:disable endpoint (preferred long-term, matches the PR description).

H2 — HTTP client (internal/pkg/routines/client.go) has zero tests

The data-plane integration boundary — every CRUD/enable/disable/dispatch/list call — has no unit coverage. This means:

Pagination loop termination (nextLink follow in ListRoutines, pageToken in ListRoutineRuns) is unverified.
Same-origin checks on nextLink are unverified.
Error envelope parsing for 401 / 403 / 404 / 409 / 5xx is unverified.
Request body marshaling for trigger/action records is unverified end-to-end.
The client-side Top cap in run-listing is not exercised.

The PR description acknowledges tests are "tracked as follow-up work." Given this is the wire boundary of a Preview API, at minimum httptest.Server-based happy paths for each method plus pagination termination should ship in v1. The current command-level tests only exercise pure helpers (buildTrigger, buildAction, applyUpdateFlags) and don't reach the client.

H3 — routine update may report "no changes" when only --file is used (routine_update.go, routine_manifest.go applyUpdateFlags)

applyUpdateFlags only increments its changed counter when individual CLI flags are set; the --file merge path does not contribute to that counter. The no-op early-return checks changed == 0, so routine update <name> --file manifest.yaml (no other flags) likely takes the no-op branch even though the manifest legitimately changed fields. Either snapshot/diff the routine before vs after the merge, or count merge-introduced changes.

🟡 Medium

M1 — Credential ignores tenant for guest / multi-tenant users (internal/cmd/routine_helpers.go)

newRoutineClient constructs azidentity.NewAzureDeveloperCLICredential(&AzureDeveloperCLICredentialOptions{}) with no TenantID. Per .github/instructions/extensions.instructions.md, extensions must use the user-access tenant (Subscription.UserTenantId, not the resource tenant). For multi-tenant or guest users the default tenant from azd auth may not match the Foundry project's tenant, causing AADSTS errors. Plumb a tenant hint (e.g., from azd context or by extracting from the endpoint/project metadata) into the credential options.

M2 — Pagination has no max-page safety cap (client.go, ListRoutines / ListRoutineRuns)

Both methods loop while NextLink/PageToken is non-empty. A misbehaving or compromised service that returns a self-referential or cycling token will cause an unbounded loop that consumes CPU and memory (each appended page grows the result slice). Add a defensive page-count cap (e.g., 1000) and return a clear "pagination exceeded N pages" error.

M3 — Unbounded io.ReadAll on response bodies (client.go, JSON decode helper)

The decode helper reads the entire response body into memory without io.LimitReader. A misbehaving server returning a multi-GB body will OOM the process. Wrap with a sane cap (e.g., io.LimitReader(body, 32<<20)) before decoding.

M4 — Token scope https://ai.azure.com/.default should be verified (client.go, NewClient)

The Bearer token scope is hardcoded. If the Foundry Routines data plane requires a project-specific or different audience (services.ai.azure.com, a regional audience, or a Foundry-specific resource ID), token acquisition will succeed but the server may reject with 401/403. Verify against the Foundry TypeSpec and add a code comment justifying the chosen scope.

M5 — assert.Error instead of require.Error in negative-path tests (routine_create_test.go)

Several "expect error" tests use assert.Error followed by further dereferences of the result. If the implementation regresses and returns no error, the test logs a failure but continues executing on a nil/zero value, potentially panicking or producing a confusing secondary failure that hides the real cause. Switch to require.Error.

🟢 Low

L1 — routine dispatch --async help text ("Suppress descriptive output; useful for scripting") is ambiguous. Clarify that it prints only the dispatch ID and exits without polling.
L2 — Manifest file-format detection falls back to JSON for empty/unknown extensions, and the error message recommends only JSON, leaving YAML users guessing. Mention both formats in the error.
L3 — routine run subcommand help doesn't mention that run show / run delete are intentionally deferred. Add a one-line note in Long.
L4 — printJSON / table writers write directly to os.Stdout, making future output-assertion tests impossible without monkey-patching. Accept an io.Writer parameter.

Notes (not findings)

This PR lives in an extension with its own go.mod, so core-azd patterns (IoC container, ActionDescriptor, ErrorWithSuggestion) do not apply — the extension uses cobra directly.
new(expr) is valid in Go 1.26+ (see cli/azd/AGENTS.md).
Adding DisallowUnknownFields to the JSON decoder is not recommended — it would break forward-compat with non-breaking Foundry API additions.

Generated review · supplements Copilot inline comments · posted as non-blocking COMMENT.

- endpoint: reject project endpoints with an explicit port so the normalized URL cannot silently strip a non-default port - routine create: only set Enabled from --enabled when the user explicitly passes the flag, so a manifest's enabled value is honored; default to enabled=true if neither source provides one - routine create: explicitly reject --trigger github-issue (deferred for v1) instead of producing an incomplete github_issue trigger - routine_helpers: boolStr now returns "unknown" for a nil pointer to avoid displaying "true" when the field is absent from the service response - routine_manifest: surface applyUpdateFlags user-input errors as exterrors.Validation (CodeInvalidParameter) for consistent CLI error shapes

wbreza

Re-review: commit `1ff54876` (`fix(routines): address PR review feedback`)

Scope: one new commit since my previous review — 6 files, +82/-9.

✅ Copilot's pre-review findings — all addressed correctly

#	Issue	Fix
1	`endpoint.go` silently strips explicit ports	Explicit port now rejected with structured error + 2 new tests
2	`--enabled` default masks manifest value	Uses `cmd.Flags().Changed("enabled")`; default applied only after manifest merge
3	`--trigger github-issue` accepted but incomplete	Explicit rejection in `buildTrigger` with deferred-v1 message
4	`boolStr` returns `"true"` for nil pointer	Now returns `"unknown"`
5	`applyUpdateFlags` uses plain `fmt.Errorf`	All 6 sites converted to `exterrors.Validation(CodeInvalidParameter, …)`

(Copilot's 6th finding — dead newRoutineCommand — was already resolved in earlier commit b7d375b8.)

❌ Still open from my prior review

These weren't in the scope of this commit, just noting so they don't get lost:

C1 — PR description still claims a dedicated :enable route and x-ms-foundry-features-opt-in header. The actual code does GET-then-PUT and sends Foundry-Features. Please reconcile description vs implementation (and verify the preview header name with the Foundry team — if the server expects x-ms-foundry-features-opt-in, opt-in is silently broken).
H1 — Lost-update race in setEnabled (no If-Match/ETag).
H2 — internal/pkg/routines/client.go still has zero tests.
H3 — routine update --file may falsely report "no changes" because file-merge doesn't increment the changed counter.
M1 — AzureDeveloperCLICredential constructed with no TenantID; will break for guest / multi-tenant users.

No new blocking issues introduced by this commit. Nice clean turnaround on the Copilot items.

…ng extensions Other AI extensions (projects, agents, toolboxes, inspector) ship a .golangci.yaml lint config and an AGENTS.md contributor guide. Add both to azure.ai.routines so it follows the same convention, and register the project-specific �xterrors word in cspell.yaml.

The deployed Foundry Routines data plane applies a camelCase property naming policy on the wire (e.g. `cronExpression`, `timeZone`, `agentId`), even though the upstream TypeSpec / OpenAPI document still emits snake_case. With snake_case JSON tags, `routine create` and `update` always failed with errors like: triggers['default'].cronExpression must be provided for schedule routines exactly one of action.agentId or action.agentEndpointId must be provided and routines read back from `show` / `list` would have empty trigger/action fields because the camelCase wire payload did not deserialize into snake_case-tagged Go fields. Switch the JSON tags on `Routine`, `RoutineTrigger`, `RoutineAction`, `RoutineRun`, `PagedRoutineRun`, and `DispatchRoutineResponse` to camelCase so requests/responses round-trip cleanly against the deployed service. YAML tags stay snake_case so user-facing `--file` manifests keep the documented convention. Verified against a live project endpoint: create/list/show now reach the service correctly (residual `InternalServerError` from the backend is unrelated and reproduces from raw curl with the same body).

Use azure-rest-api-specs PR #43186 (Foundry Routines TypeSpec) as the single source of truth for the routines extension, applying every spec change that does not break the currently deployed service, and documenting each deliberate divergence inline and in AGENTS.md. ## Spec alignment * `RoutineRun` and `DispatchRoutineResponse` gain the new `TaskID` field (wire `taskId`); the service already emits it. `dispatch` now prints `Task ID` after `Action Correlation ID`, and JSON output exposes the new field too. * `RoutineTrigger` is restructured to match the spec's `GitHubIssueOpenedRoutineTrigger` shape: dropped `Owner` / `Actions[]`, added `Assignee`. The github trigger is still deferred at the CLI surface, so this is safe. * Inline comments and a new AGENTS.md table call out each divergence the client deliberately keeps to stay compatible with the live service: camelCase wire naming (spec is snake_case), `agentId` field (spec renamed to `agent_name`), `:dispatchAsync` action segment (spec uses `:dispatch_async`), GET+PUT enable/disable fallback (spec adds dedicated routes which still 404), `value`/`nextLink` / `value`/`nextPageToken` paged shapes (spec uses `AgentsPagedResult<T>`), and `github_issue` wire value (spec renamed to `github_issue_opened`). ## CLI bug fix: HTTP/2 stream-reset hang The pipeline now uses a custom `http.Client` with an explicit `ResponseHeaderTimeout` (60s) and `TryTimeout` (30s), and azcore retries are capped at 1. When the Foundry service returns an HTTP/2 RST_STREAM (for example, the schedule-create InternalServerError), the CLI now surfaces a `context deadline exceeded` error within ~40 seconds instead of the previous ~6 minute hang. ## Verified end-to-end against a live Foundry project * timer create / show / list / update / disable / enable / dispatch (with `taskId` round-tripping) / run list / delete all succeed. * schedule create still fails (service-side ISE) but now in under a minute instead of six.

The Foundry data plane currently returns `InternalServerError` for any `PUT /routines/{name}` request whose trigger is `schedule` (the wire value behind the CLI's `--trigger recurring`). The CLI side is fully implemented and verified correct via raw curl, so this is a service-side issue, but it leaves the `recurring` trigger non-functional end-to-end. Take the recurring trigger off the public CLI surface so users do not hit the service hang: * Drop the `--cron` flag from `routine create` and `routine update`. * `--trigger recurring` is now rejected with the same "deferred" shape as `--trigger github-issue`: a clear error pointing the user at `--trigger timer` and explaining that recurring is gated on the Foundry service. * `--trigger` help text and validation messages list only `timer`. The underlying wire model still carries `cron_expression` / `time_zone` and the `schedule` discriminator so re-enabling the trigger when the service is ready is just a CLI flag-wiring change. Unit tests around buildTrigger and applyUpdateFlags are updated accordingly.

…pt in delete, and action-type flag validation

wbreza

Re-review of PR #8241 — feat(routines): implement azd ai routine commands

Reviewed since wbreza's May 20 review. Skipping items already flagged by prior reviews (endpoint port stripping, body.Enabled flag override, buildTrigger deferred github-issue alias, boolStr nil-pointer returning "true", applyUpdateFlags plain fmt.Errorf).

🔴 High — Security / correctness

1. internal/pkg/routines/client.go — Missing Logging.AllowedHeaders on policy.ClientOptions
Sibling azure.ai.agents restricts logged headers to MsCorrelationIdHeader only. The routines client omits the Logging field entirely. Azure SDK default logging will include the Authorization header (bearer token) in pipeline debug logs.
Fix: mirror agents' Logging: policy.LogOptions{AllowedHeaders: []string{azsdk.MsCorrelationIdHeader}}.

2. internal/cmd/routine_dispatch.go — Wrong op code on inner GetRoutine error
When fetching the routine to validate --input, the error is wrapped with exterrors.OpGetRoutine instead of exterrors.OpDispatchRoutine. Telemetry / user-facing error shows get_routine.* when the user ran dispatch. Note the adjacent IsNotFound branch already uses OpDispatchRoutine — the inconsistency is the smell.

3. internal/exterrors/errors.go — IsCancellation misses gRPC codes.Canceled
Routines checks only errors.Is(err, context.Canceled). Agents also checks status.FromError(err).Code() == codes.Canceled. Cancellations bubbling up from gRPC azdext calls are misclassified as internal errors instead of user-driven.
Fix: port the agents implementation verbatim.

🟡 Medium — Test coverage & design

4. internal/pkg/routines/client.go (+390 LOC) — no unit tests
The data-plane client (CRUD, pagination, enable/disable GET+PUT fallback, dispatchAsync, list runs, custom HTTP/2 transport, same-origin pagination guard) has zero tests. PR description defers "unit tests as follow-up", but the manifest parser (low-risk) gets tests while the mission-critical client does not. Suggest at minimum httptest.NewServer-based tests for: 2xx happy paths, 404/409/5xx, nextLink pagination, validateSameOrigin rejection, context cancellation.

5. Command-layer coverage gaps
No tests for routine_update.go (complex flag→file→default merge), routine_delete, routine_enable, routine_disable, routine_dispatch, routine_list, routine_show, routine_helpers. Agents extension test ratio is ~1:1; this PR is ~1:2.2.

6. internal/pkg/routines/client.go — runtime.NewResponseError(resp) leaks raw response body into errors
Used in ~5 spots. Service error bodies pass straight through. CLI commands re-wrap with exterrors.ServiceFromAzure, but the underlying message text is preserved and may surface in logs/output.
Fix: drain body, extract structured fields, return a controlled message — or document that callers must always wrap and never log the raw error.

7. internal/cmd/routine_manifest.go — os.ReadFile errors all classified as CodeFileNotFound
Permission-denied, is-a-directory, etc. surface to the user as "routine manifest file not found" with misleading remediation.
Fix: gate with os.IsNotExist(err); fall back to a generic file-read error otherwise.

8. internal/exterrors/codes.go — No semantic "not found" code
404s become get_routine.404 rather than e.g. routine_not_found. Hurts telemetry classification and stable error-handling contracts.
Fix: add CodeRoutineNotFound and use it in the IsNotFound branches.

🟢 Low / nit

9. internal/cmd/endpoint_test.go — port edge cases under-tested
The recently-added port rejection covers :443 only. Add: non-default ports (:8443), IPv6+port (https://[::1]:443/...), whitespace-only input. The original bug evaded tests because they didn't exercise non-default ports.

10. internal/cmd/routine_update.go — _ = cmd.Flags().MarkHidden(...) (×2)
Silent error discard. Safe in practice, but if err := ...; err != nil { ... } or a brief comment is cleaner.

11. AGENTS.md — "If this extension grows enough… introduce internal/exterrors"
Phrased as a future option, but the package exists in this PR. Rewrite in present tense so future maintainers don't treat it as optional.

12. go.mod — armappservice/v2 v2.3.0 // indirect added
Indirect, likely a cascade from an azd-core dep bump. Worth a quick confirmation that this isn't pulling unwanted weight on the extension.

✅ Verified clean (no action)

URL construction (url.PathEscape / url.QueryEscape)
Pagination validateSameOrigin
HTTP/2 timeout fix (explicit ResponseHeaderTimeout, TLSHandshakeTimeout) is sound, not masking
Wire format alignment (camelCase tags, agentId, :dispatchAsync, GET+PUT enable/disable fallback) matches deployed Foundry service per AGENTS.md
Bearer token scope https://ai.azure.com/.default
Endpoint validation (HTTPS-only, port rejection, .services.ai.azure.com allowlist)
exterrors duplication with agents / toolboxes is acknowledged in AGENTS.md as the established cross-extension pattern — not flagged

microsoft-github-policy-service Bot assigned huimiu May 19, 2026

huimiu added 6 commits May 19, 2026 15:18

fix(routines): remove unused ptrBool helper

32dcd78

golangci-lint flagged ptrBool as unused. The function had no call sites; the //go:fix inline directive does not exempt it from the unused linter.

docs(routines): remove design spec from PR

f7d1284

The design spec is tracked separately in PR #8200; this PR focuses on the implementation only.

huimiu force-pushed the hui/add-routine-command branch from b1c629f to 8e0d723 Compare May 19, 2026 09:00

huimiu marked this pull request as ready for review May 19, 2026 09:46

Copilot AI review requested due to automatic review settings May 19, 2026 09:46

huimiu requested review from JeffreyCA, hemarina, jongio, rajeshkamal5050, tg-msft, vhvb1989, wbreza and weikanglim as code owners May 19, 2026 09:46

Copilot AI reviewed May 19, 2026

View reviewed changes

huimiu changed the title ~~feat(routines): implement azd ai routine commands (v1)~~ feat(routines): implement azd ai routine commands May 19, 2026

fix(routines): clarify comment for github_issue fields in RoutineTrigger

4a9cbe3

wbreza reviewed May 19, 2026

View reviewed changes

wbreza reviewed May 20, 2026

View reviewed changes

JeffreyCA added the ext-agents azure.ai.{agents,connections,inspector,projects,routines,skills,toolboxes} extensions label May 20, 2026

huimiu marked this pull request as draft May 21, 2026 08:03

huimiu changed the title ~~feat(routines): implement azd ai routine commands~~ feat(routines): add azd ai routine commands May 22, 2026

huimiu added 3 commits May 22, 2026 19:25

huimiu force-pushed the hui/add-routine-command branch from 5e76b31 to 0046e61 Compare May 22, 2026 13:39

fix(routines): enforce update-mode manifest merge, env-backed no-prom…

9659512

…pt in delete, and action-type flag validation

huimiu force-pushed the hui/add-routine-command branch from 0046e61 to 9659512 Compare May 22, 2026 13:42

fix(routines): replace unknown word 'misroute' in endpoint comment

53114fa

huimiu marked this pull request as ready for review May 22, 2026 13:49

huimiu requested a review from RickWinter as a code owner May 22, 2026 13:49

huimiu added 2 commits May 22, 2026 21:56

feat(routines): add exterrors unit tests and .gitignore for bin/

57eb23c

style(routines): fix gofmt formatting in test files

e76c689

wbreza reviewed May 22, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(routines): add azd ai routine commands#8241

feat(routines): add azd ai routine commands#8241
huimiu wants to merge 19 commits into
mainfrom
hui/add-routine-command

huimiu commented May 19, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 19, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

wbreza left a comment

Uh oh!

wbreza left a comment

Uh oh!

wbreza left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

huimiu commented May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Included

Deferred (re-add when the Foundry service is ready)

Tests

Uh oh!

github-actions Bot commented May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

📋 Prioritization Note

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

wbreza left a comment

Choose a reason for hiding this comment

Code Review (automated, supplements Copilot pre-review)

TL;DR

🔴 Critical

🟠 High

🟡 Medium

🟢 Low

Notes (not findings)

Uh oh!

wbreza left a comment

Choose a reason for hiding this comment

Re-review: commit 1ff54876 (fix(routines): address PR review feedback)

✅ Copilot's pre-review findings — all addressed correctly

❌ Still open from my prior review

Uh oh!

wbreza left a comment

Choose a reason for hiding this comment

Re-review of PR #8241 — feat(routines): implement azd ai routine commands

🔴 High — Security / correctness

🟡 Medium — Test coverage & design

🟢 Low / nit

✅ Verified clean (no action)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

huimiu commented May 19, 2026 •

edited

Loading

github-actions Bot commented May 19, 2026 •

edited

Loading

Re-review: commit `1ff54876` (`fix(routines): address PR review feedback`)