feat: classify run failure error codes and improve error logging #1340

Merged
pranaygp merged 1 commit into main from pgp/run-failed-schema-vailidation-error
Mar 18, 2026

@pranaygp pranaygp commented Mar 12, 2026

Summary

  • Adds error code classification (USER_ERROR, RUNTIME_ERROR) to run_failed events, populating the existing but previously unused errorCode field
  • Improves error logging across world-local queue, world-vercel schema validation, and runtime to be more concise and user-friendly
  • Stacked on top of fix: separate infrastructure vs user code error handling #1339 which structurally separates infrastructure vs user code error handling

Details

Error codes

After #1339's structural separation, the run_failed try/catch only catches:

  • User code errors → USER_ERROR (throws from workflow functions, propagated step failures)
  • WorkflowRuntimeError → RUNTIME_ERROR (corrupted event log, missing timestamps; these are internal bugs)

Infrastructure errors (ECONNRESET, 5xx, schema validation) never produce run_failed at all — they propagate to the queue for retry.
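
The classification rule above can be sketched roughly as follows. This is an illustration only, not the PR's actual code: the local `WorkflowRuntimeError` class is a stand-in for the real one, and the body of `classifyRunError` is an assumption derived from the two bullets.

```typescript
// Hypothetical stand-in for the real WorkflowRuntimeError class.
class WorkflowRuntimeError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'WorkflowRuntimeError';
  }
}

// The two codes named in this PR.
const RUN_ERROR_CODES = {
  USER_ERROR: 'USER_ERROR',
  RUNTIME_ERROR: 'RUNTIME_ERROR',
} as const;

type RunErrorCode = (typeof RUN_ERROR_CODES)[keyof typeof RUN_ERROR_CODES];

// Assumed rule: internal runtime errors map to RUNTIME_ERROR; anything else
// caught by the run_failed try/catch is treated as a user code error.
function classifyRunError(err: unknown): RunErrorCode {
  return err instanceof WorkflowRuntimeError
    ? RUN_ERROR_CODES.RUNTIME_ERROR
    : RUN_ERROR_CODES.USER_ERROR;
}
```

Infrastructure errors never reach this function in the sketch, which mirrors the statement that they propagate to the queue instead of producing run_failed.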

The error code flows through the existing (previously unused) plumbing:
eventData.errorCode → StructuredError.code → WorkflowRunFailedError.cause.code

Note on storage: errorCode is stored inline as a plain DynamoDB attribute on the run entity; it does NOT go through refs/encryption. Only the error object (message + stack) goes through refTracker → errorRef. The errorCode is a sibling field in eventData, extracted and stored separately by the server (events.ts:846).
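
To make the storage split concrete, here is a small sketch. Every type and function name in it is hypothetical; only the field names `error`, `errorCode`, and `errorRef` come from the note above, and `putRef` stands in for whatever actually persists the error payload.

```typescript
// Hypothetical shapes; only the field names come from the description.
interface RunFailedEventData {
  error: { message: string; stack?: string };
  errorCode?: string;
}

interface StoredRunFailure {
  errorRef: string; // pointer to the (possibly encrypted) error payload
  errorCode?: string; // plain attribute, readable without decryption
}

// `putRef` is an assumed callback that stores a payload and returns a ref id.
function toStoredRunFailure(
  data: RunFailedEventData,
  putRef: (payload: unknown) => string
): StoredRunFailure {
  return {
    errorRef: putRef(data.error), // message + stack go behind the ref
    errorCode: data.errorCode, // sibling field is copied as-is
  };
}
```

The point of the sketch is that a reader of the stored record can see `errorCode` without resolving or decrypting `errorRef`.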

Web UI

  • RUNTIME_ERROR: amber dot + "Internal Error" tooltip header
  • USER_ERROR / absent (backward compat): red dot + "Error Details" tooltip header
  • Error code shown as a label in the tooltip

Logging improvements

See examples below.

Error log examples (captured from e2e test runs)

Runtime error logs (before → after)

Before:

[Workflow] Error while running workflow {
  workflowRunId: 'wrun_01KKFXW09EHA9M3QXWNEJNC52Z',
  errorName: 'Error',
  errorStack: 'Error: Nested workflow error\n    at errorNested3 ...'
}

After (now includes errorCode):

[Workflow] Error while running workflow {
  workflowRunId: 'wrun_01KKFY7MARP7D16PN69HCMYNVQ',
  errorCode: 'USER_ERROR',
  errorName: 'Error',
  errorStack: 'Error: Nested workflow error\n    at errorNested3 ...'
}

Queue error logs (before → after)

Before (dumped full request body with traceCarrier, runId, stepId, etc.):

[local world] Failed to queue message {
  queueName: '__wkf_step_...',
  text: '"WorkflowAPIError: Injected 5xx"',
  status: 500,
  headers: { ... },
  body: '{"workflowName":"workflow//./workflows/99_e2e//serverError5xxRetryWorkflow",
    "workflowRunId":"wrun_01KKF...",
    "workflowStartedAt":1773282422605,
    "stepId":"step_01KKF...",
    "traceCarrier":{"traceparent":"00-778ab...","baggage":"workflow.run_id=wrun_01KKF..."},
    "requestedAt":"2026-03-12T02:27:02.778Z"}'
}

After (concise, actionable):

[world-local] Queue message failed (attempt 1/3, status 500): "WorkflowAPIError: Injected 5xx" {
  queueName: '__wkf_step_...',
  messageId: 'msg_01KKF...'
}

Schema validation error messages (before → after)

Before (full Zod error dump + CBOR debug context always included):

Schema validation failed for POST /v2/runs/wrun_.../events:

[
  {
    "expected": "object",
    "code": "invalid_type",
    "path": ["run", "error"],
    "message": "Invalid input: expected object, received undefined"
  }
]

Response context: Content-Type: application/cbor, 1589 bytes (CBOR), preview: {
  event: { runId: 'wrun_...', eventId: 'evnt_...', correlationId: 'wrun_...',
    eventType: 'run_failed', eventData: { error: { _ref: 's3rf:team_...', _type: 'RemoteRef' } },
    createdAt: 20... }
}

After (concise issue list, verbose context only when DEBUG env var is set):

Schema validation failed for POST /v2/runs/wrun_.../events:
  run.error: Invalid input: expected object, received undefined
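
The concise format above could be produced by a helper along these lines. This is a sketch under assumptions: `ValidationIssue` only mirrors the `path` and `message` fields visible in the "Before" dump, `formatIssues` is a hypothetical name, and the `(root)` placeholder for an empty path is invented for illustration.

```typescript
// Minimal issue shape, mirroring the fields shown in the "Before" dump.
// Deliberately not imported from Zod so the example stays self-contained.
interface ValidationIssue {
  path: Array<string | number>;
  message: string;
}

// Hypothetical helper: one indented "path: message" line per issue.
function formatIssues(issues: ValidationIssue[]): string {
  return issues
    .map((issue) => `  ${issue.path.join('.') || '(root)'}: ${issue.message}`)
    .join('\n');
}

const formatted = formatIssues([
  { path: ['run', 'error'], message: 'Invalid input: expected object, received undefined' },
]);
```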

Debug curl reproduction (before → after)

Before: only shown when DEBUG=1 (exact string match)
After: shown when DEBUG is set to any truthy value (consistent with debug package)
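
The difference between the two checks can be sketched as below. The function names are hypothetical, and the environment value is passed in as a parameter to keep the example self-contained.

```typescript
// Before: only the exact string '1' enabled the debug output.
function isDebugStrict(value: string | undefined): boolean {
  return value === '1';
}

// After: any non-empty value enables it, e.g. DEBUG=workflow:* or DEBUG=*.
function isDebugTruthy(value: string | undefined): boolean {
  return Boolean(value);
}
```

Because environment variables are strings, values such as `'0'` or `'false'` are also truthy under the second check.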

Test plan

  • Unit tests for classifyRunError (7 tests, all pass)
  • All 478 core unit tests pass
  • E2E: errorWorkflowNested — asserts error.cause.code === 'USER_ERROR' and runData.error.code === 'USER_ERROR'
  • E2E: errorRetryFatal — asserts error.cause.code === 'USER_ERROR'
  • E2E: infraErrorRetryWorkflow — validates infra errors on run_completed retry via queue (not run_failed)
  • Visual: verify amber vs red badge in web UI

🤖 Generated with Claude Code

vercel bot commented Mar 12, 2026

changeset-bot bot commented Mar 12, 2026

🦋 Changeset detected

Latest commit: bcad456

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 21 packages
Name Type
@workflow/errors Patch
@workflow/core Patch
@workflow/web Patch
@workflow/world-local Patch
@workflow/world-vercel Patch
@workflow/builders Patch
@workflow/cli Patch
workflow Patch
@workflow/world-postgres Patch
@workflow/next Patch
@workflow/nitro Patch
@workflow/vitest Patch
@workflow/web-shared Patch
@workflow/world-testing Patch
@workflow/astro Patch
@workflow/nest Patch
@workflow/rollup Patch
@workflow/sveltekit Patch
@workflow/vite Patch
@workflow/ai Patch
@workflow/nuxt Patch

github-actions bot commented Mar 12, 2026

🧪 E2E Test Results

Some tests failed

Summary

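| | Passed | Failed | Skipped | Total |
| --- | --- | --- | --- | --- |
| ✅ ▲ Vercel Production | 758 | 0 | 67 | 825 |
| ✅ 💻 Local Development | 782 | 0 | 118 | 900 |
| ✅ 📦 Local Production | 782 | 0 | 118 | 900 |
| ❌ 🐘 Local Postgres | 781 | 1 | 118 | 900 |
| ✅ 🪟 Windows | 72 | 0 | 3 | 75 |
| ❌ 🌍 Community Worlds | 118 | 56 | 15 | 189 |
| ❌ 📋 Other | 197 | 1 | 27 | 225 |
| Total | 3490 | 58 | 466 | 4014 |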

❌ Failed Tests

🐘 Local Postgres (1 failed)

sveltekit-stable (1 failed):

  • webhookWorkflow | wrun_01KM16JFFMSJNY4A120635Z2N0
🌍 Community Worlds (56 failed)

mongodb (3 failed):

  • hookWorkflow is not resumable via public webhook endpoint | wrun_01KM16J6TZAKBH7QXRBD6V9TN6
  • webhookWorkflow | wrun_01KM16JFFMSJNY4A120635Z2N0
  • concurrent hook token conflict - two workflows cannot use the same hook token simultaneously | wrun_01KM16QXS6R4ZFCW43YM379Q8K

redis (2 failed):

  • hookWorkflow is not resumable via public webhook endpoint | wrun_01KM16J6TZAKBH7QXRBD6V9TN6
  • concurrent hook token conflict - two workflows cannot use the same hook token simultaneously | wrun_01KM16QXS6R4ZFCW43YM379Q8K

turso (51 failed):

  • addTenWorkflow | wrun_01KM16H1BXF1Y1GZCG6ZD2NVGA
  • addTenWorkflow | wrun_01KM16H1BXF1Y1GZCG6ZD2NVGA
  • wellKnownAgentWorkflow (.well-known/agent) | wrun_01KM16J2QM1N4TS0GXAJ9PV4D8
  • should work with react rendering in step
  • promiseAllWorkflow | wrun_01KM16H87FE4JQ029E3JSW7BB3
  • promiseRaceWorkflow | wrun_01KM16HBK6YGKA6D0W7E3WRQEK
  • promiseAnyWorkflow | wrun_01KM16HDSNA0M14FA412ZX8GYH
  • importedStepOnlyWorkflow | wrun_01KM16JCZF90F1DT7GMDC84HV0
  • hookWorkflow | wrun_01KM16HVR3E2W78FDFFK8ZAEH8
  • hookWorkflow is not resumable via public webhook endpoint | wrun_01KM16J6TZAKBH7QXRBD6V9TN6
  • webhookWorkflow | wrun_01KM16JFFMSJNY4A120635Z2N0
  • sleepingWorkflow | wrun_01KM16JPE4XGEK9NF3TWPV4PPQ
  • parallelSleepWorkflow | wrun_01KM16K21DH1YMFD51YKEWTQGR
  • nullByteWorkflow | wrun_01KM16K6GMBBC145MC3NXR8BMG
  • workflowAndStepMetadataWorkflow | wrun_01KM16K8N2RC3204GQG1JB6ASY
  • fetchWorkflow | wrun_01KM16M860J7720ZNM5SPKMNME
  • promiseRaceStressTestWorkflow | wrun_01KM16MBNNJXEFF8K7RN0TZ0MY
  • error handling error propagation workflow errors nested function calls preserve message and stack trace
  • error handling error propagation workflow errors cross-file imports preserve message and stack trace
  • error handling error propagation step errors basic step error preserves message and stack trace
  • error handling error propagation step errors cross-file step error preserves message and function names in stack
  • error handling retry behavior regular Error retries until success
  • error handling retry behavior FatalError fails immediately without retries
  • error handling retry behavior RetryableError respects custom retryAfter delay
  • error handling retry behavior maxRetries=0 disables retries
  • error handling catchability FatalError can be caught and detected with FatalError.is()
  • hookCleanupTestWorkflow - hook token reuse after workflow completion | wrun_01KM16QA2CMVJPNGQVJS202FZW
  • concurrent hook token conflict - two workflows cannot use the same hook token simultaneously | wrun_01KM16QXS6R4ZFCW43YM379Q8K
  • hookDisposeTestWorkflow - hook token reuse after explicit disposal while workflow still running | wrun_01KM16RHZEC82KPP70CVGXQ1SZ
  • stepFunctionPassingWorkflow - step function references can be passed as arguments (without closure vars) | wrun_01KM16S628CP26EW2F3763JQ42
  • stepFunctionWithClosureWorkflow - step function with closure variables passed as argument | wrun_01KM16SEYRAZ6QR4KQR7G83S0Z
  • closureVariableWorkflow - nested step functions with closure variables | wrun_01KM16SMFZ3F0RCPAH4JBCBJ45
  • spawnWorkflowFromStepWorkflow - spawning a child workflow using start() inside a step | wrun_01KM16SPQ4JZ9HYMR89AQ202WF
  • health check (queue-based) - workflow and step endpoints respond to health check messages
  • pathsAliasWorkflow - TypeScript path aliases resolve correctly | wrun_01KM16T6ETD56SFF46GX0N049C
  • Calculator.calculate - static workflow method using static step methods from another class | wrun_01KM16TC55TBEN26Z65WB2JNZE
  • AllInOneService.processNumber - static workflow method using sibling static step methods | wrun_01KM16THVAACYQ0QDCXE0V04DQ
  • ChainableService.processWithThis - static step methods using this to reference the class | wrun_01KM16TRMXHMTPXM03EWFHZEFQ
  • thisSerializationWorkflow - step function invoked with .call() and .apply() | wrun_01KM16TZBTKQJYPSDQR3RT4FJK
  • customSerializationWorkflow - custom class serialization with WORKFLOW_SERIALIZE/WORKFLOW_DESERIALIZE | wrun_01KM16V673WJDTHDP9N625T0Y0
  • instanceMethodStepWorkflow - instance methods with "use step" directive | wrun_01KM16VD6VJ8NFSBR32WCBM88X
  • crossContextSerdeWorkflow - classes defined in step code are deserializable in workflow context | wrun_01KM16VR2K28ADCCWBG3H8JG26
  • stepFunctionAsStartArgWorkflow - step function reference passed as start() argument | wrun_01KM16W02CDN1296Y2SVJETTA0
  • cancelRun - cancelling a running workflow | wrun_01KM16W6TRZ3T8582HN3R6GB59
  • cancelRun via CLI - cancelling a running workflow | wrun_01KM16WGHJZMM5P0CDBA4QGHRD
  • pages router addTenWorkflow via pages router
  • pages router promiseAllWorkflow via pages router
  • pages router sleepingWorkflow via pages router
  • hookWithSleepWorkflow - hook payloads delivered correctly with concurrent sleep | wrun_01KM16WWZD60524EDR69WA9VWK
  • sleepInLoopWorkflow - sleep inside loop with steps actually delays each iteration | wrun_01KM16XKJ6K117NBQNEF591ZNF
  • sleepWithSequentialStepsWorkflow - sequential steps work with concurrent sleep (control) | wrun_01KM16XY29QJHXDN2V770BHE2J
📋 Other (1 failed)

e2e-local-postgres-nest-stable (1 failed):

  • webhookWorkflow | wrun_01KM16JFFMSJNY4A120635Z2N0

Details by Category

✅ ▲ Vercel Production
App Passed Failed Skipped
✅ astro 68 0 7
✅ example 68 0 7
✅ express 68 0 7
✅ fastify 68 0 7
✅ hono 68 0 7
✅ nextjs-turbopack 73 0 2
✅ nextjs-webpack 73 0 2
✅ nitro 68 0 7
✅ nuxt 68 0 7
✅ sveltekit 68 0 7
✅ vite 68 0 7
✅ 💻 Local Development
App Passed Failed Skipped
✅ astro-stable 66 0 9
✅ express-stable 66 0 9
✅ fastify-stable 66 0 9
✅ hono-stable 66 0 9
✅ nextjs-turbopack-canary 55 0 20
✅ nextjs-turbopack-stable 72 0 3
✅ nextjs-webpack-canary 55 0 20
✅ nextjs-webpack-stable 72 0 3
✅ nitro-stable 66 0 9
✅ nuxt-stable 66 0 9
✅ sveltekit-stable 66 0 9
✅ vite-stable 66 0 9
✅ 📦 Local Production
App Passed Failed Skipped
✅ astro-stable 66 0 9
✅ express-stable 66 0 9
✅ fastify-stable 66 0 9
✅ hono-stable 66 0 9
✅ nextjs-turbopack-canary 55 0 20
✅ nextjs-turbopack-stable 72 0 3
✅ nextjs-webpack-canary 55 0 20
✅ nextjs-webpack-stable 72 0 3
✅ nitro-stable 66 0 9
✅ nuxt-stable 66 0 9
✅ sveltekit-stable 66 0 9
✅ vite-stable 66 0 9
❌ 🐘 Local Postgres
App Passed Failed Skipped
✅ astro-stable 66 0 9
✅ express-stable 66 0 9
✅ fastify-stable 66 0 9
✅ hono-stable 66 0 9
✅ nextjs-turbopack-canary 55 0 20
✅ nextjs-turbopack-stable 72 0 3
✅ nextjs-webpack-canary 55 0 20
✅ nextjs-webpack-stable 72 0 3
✅ nitro-stable 66 0 9
✅ nuxt-stable 66 0 9
❌ sveltekit-stable 65 1 9
✅ vite-stable 66 0 9
✅ 🪟 Windows
App Passed Failed Skipped
✅ nextjs-turbopack 72 0 3
❌ 🌍 Community Worlds
App Passed Failed Skipped
✅ mongodb-dev 3 0 2
❌ mongodb 52 3 3
✅ redis-dev 3 0 2
❌ redis 53 2 3
✅ turso-dev 3 0 2
❌ turso 4 51 3
❌ 📋 Other
App Passed Failed Skipped
✅ e2e-local-dev-nest-stable 66 0 9
❌ e2e-local-postgres-nest-stable 65 1 9
✅ e2e-local-prod-nest-stable 66 0 9

📋 View full workflow run


Some E2E test jobs failed:

  • Vercel Prod: success
  • Local Dev: success
  • Local Prod: success
  • Local Postgres: failure
  • Windows: success

Check the workflow run for details.

@pranaygp pranaygp force-pushed the pgp/run-failed-schema-vailidation-error branch from e7068bf to e019085 Compare March 17, 2026 19:35
 return 'bg-emerald-500';
 case 'failed':
-return 'bg-red-500';
+return isInfra ? 'bg-amber-500' : 'bg-red-500';
Collaborator Author

I think we should just show red no matter whether it's infra or user error

Collaborator Author

also don't render the entire error stack trace in the tooltip. That won't even be accessible without decryption when a run is encrypted - so it's enough to just show the error code for now

Collaborator Author

Fixed — all failures are red now, removed the amber/infra distinction.

Collaborator Author

Fixed — tooltip now only shows the error code (e.g. USER_ERROR), no stack trace or message. The full error data is behind the ref/encryption anyway.

github-actions bot commented Mar 18, 2026

📊 Benchmark Results

📈 Comparing against baseline from main branch. Green 🟢 = faster, Red 🔺 = slower.

workflow with no steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
💻 Local 🥇 Express 0.043s (-1.6%) 1.005s (~) 0.962s 10 1.00x
💻 Local Nitro 0.045s (+4.0%) 1.006s (~) 0.961s 10 1.03x
💻 Local Next.js (Turbopack) 0.051s (+22.7% 🔺) 1.005s (~) 0.954s 10 1.18x
🌐 Redis Next.js (Turbopack) 0.057s (+5.8% 🔺) 1.005s (~) 0.948s 10 1.32x
🐘 Postgres Express 0.058s (-26.2% 🟢) 1.012s (-1.4%) 0.954s 10 1.33x
🐘 Postgres Nitro 0.060s (-3.6%) 1.013s (~) 0.953s 10 1.38x
🐘 Postgres Next.js (Turbopack) 0.066s (+13.0% 🔺) 1.012s (~) 0.946s 10 1.53x
🌐 MongoDB Next.js (Turbopack) 0.100s (~) 1.008s (~) 0.908s 10 2.31x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Next.js (Turbopack) 0.438s (-9.9% 🟢) 2.091s (-12.8% 🟢) 1.653s 10 1.00x
▲ Vercel Express 0.456s (~) 2.295s (+5.4% 🔺) 1.839s 10 1.04x
▲ Vercel Nitro 0.546s (+12.9% 🔺) 2.577s (+4.0%) 2.032s 10 1.25x

🔍 Observability: Next.js (Turbopack) | Express | Nitro

workflow with 1 step

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
💻 Local 🥇 Next.js (Turbopack) 1.121s (+1.5%) 2.006s (~) 0.885s 10 1.00x
💻 Local Nitro 1.129s (~) 2.006s (~) 0.876s 10 1.01x
💻 Local Express 1.130s (+2.2%) 2.006s (~) 0.876s 10 1.01x
🌐 Redis Next.js (Turbopack) 1.133s (~) 2.007s (~) 0.874s 10 1.01x
🐘 Postgres Next.js (Turbopack) 1.143s (~) 2.011s (~) 0.868s 10 1.02x
🐘 Postgres Express 1.145s (-0.7%) 2.012s (~) 0.867s 10 1.02x
🐘 Postgres Nitro 1.156s (~) 2.013s (~) 0.858s 10 1.03x
🌐 MongoDB Next.js (Turbopack) 1.306s (-0.8%) 2.009s (~) 0.703s 10 1.16x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Next.js (Turbopack) 2.055s (+0.9%) 3.320s (-6.0% 🟢) 1.265s 10 1.00x
▲ Vercel Nitro 2.119s (~) 3.742s (+0.8%) 1.622s 10 1.03x
▲ Vercel Express 2.285s (+7.2% 🔺) 3.689s (+6.1% 🔺) 1.405s 10 1.11x

🔍 Observability: Next.js (Turbopack) | Nitro | Express

workflow with 10 sequential steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
💻 Local 🥇 Next.js (Turbopack) 10.840s (+1.7%) 11.025s (~) 0.184s 3 1.00x
🐘 Postgres Next.js (Turbopack) 10.870s (~) 11.039s (~) 0.169s 3 1.00x
🌐 Redis Next.js (Turbopack) 10.884s (+1.1%) 11.023s (~) 0.139s 3 1.00x
🐘 Postgres Express 10.891s (-0.6%) 11.041s (~) 0.151s 3 1.00x
💻 Local Nitro 10.912s (~) 11.024s (~) 0.113s 3 1.01x
💻 Local Express 10.923s (+1.9%) 11.022s (~) 0.099s 3 1.01x
🐘 Postgres Nitro 10.950s (~) 11.041s (~) 0.091s 3 1.01x
🌐 MongoDB Next.js (Turbopack) 12.284s (~) 13.028s (~) 0.744s 3 1.13x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Express 17.093s (-1.4%) 18.560s (-1.5%) 1.466s 2 1.00x
▲ Vercel Nitro 17.322s (-4.0%) 19.086s (-6.0% 🟢) 1.764s 2 1.01x
▲ Vercel Next.js (Turbopack) 17.552s (+0.8%) 18.804s (-0.8%) 1.252s 2 1.03x

🔍 Observability: Express | Nitro | Next.js (Turbopack)

workflow with 25 sequential steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🌐 Redis 🥇 Next.js (Turbopack) 26.812s (~) 27.049s (~) 0.237s 3 1.00x
🐘 Postgres Next.js (Turbopack) 27.191s (~) 27.722s (~) 0.531s 3 1.01x
💻 Local Next.js (Turbopack) 27.241s (+2.0%) 28.054s (+3.7%) 0.813s 3 1.02x
🐘 Postgres Nitro 27.249s (~) 28.064s (~) 0.815s 3 1.02x
🐘 Postgres Express 27.293s (~) 28.062s (~) 0.769s 3 1.02x
💻 Local Express 27.507s (+2.0%) 28.053s (+3.7%) 0.546s 3 1.03x
💻 Local Nitro 27.587s (~) 28.054s (~) 0.467s 3 1.03x
🌐 MongoDB Next.js (Turbopack) 30.236s (~) 31.055s (~) 0.818s 2 1.13x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Nitro 43.975s (-1.2%) 45.567s (-1.0%) 1.592s 2 1.00x
▲ Vercel Express 45.083s (-0.8%) 46.775s (-1.0%) 1.692s 2 1.03x
▲ Vercel Next.js (Turbopack) 46.056s (-1.4%) 48.004s (-0.9%) 1.948s 2 1.05x

🔍 Observability: Nitro | Express | Next.js (Turbopack)

workflow with 50 sequential steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🌐 Redis 🥇 Next.js (Turbopack) 53.371s (~) 54.089s (~) 0.717s 2 1.00x
🐘 Postgres Express 54.255s (~) 55.093s (~) 0.838s 2 1.02x
🐘 Postgres Nitro 54.392s (~) 55.097s (~) 0.705s 2 1.02x
🐘 Postgres Next.js (Turbopack) 54.429s (+0.6%) 54.617s (+0.9%) 0.187s 2 1.02x
💻 Local Next.js (Turbopack) 56.042s (+2.2%) 56.605s (+2.7%) 0.563s 2 1.05x
💻 Local Express 56.641s (+2.1%) 57.104s (+1.8%) 0.463s 2 1.06x
💻 Local Nitro 56.745s (~) 57.106s (~) 0.361s 2 1.06x
🌐 MongoDB Next.js (Turbopack) 60.736s (~) 61.083s (~) 0.347s 2 1.14x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Express 93.943s (-7.2% 🟢) 96.079s (-6.4% 🟢) 2.136s 1 1.00x
▲ Vercel Nitro 95.428s (-1.8%) 97.707s (-1.9%) 2.279s 1 1.02x
▲ Vercel Next.js (Turbopack) 98.310s (~) 99.739s (~) 1.429s 1 1.05x

🔍 Observability: Express | Nitro | Next.js (Turbopack)

Promise.all with 10 concurrent steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🐘 Postgres 🥇 Express 1.270s (-0.6%) 2.010s (~) 0.740s 15 1.00x
🐘 Postgres Nitro 1.276s (~) 2.012s (~) 0.735s 15 1.00x
🐘 Postgres Next.js (Turbopack) 1.289s (+1.0%) 2.014s (~) 0.725s 15 1.01x
🌐 Redis Next.js (Turbopack) 1.391s (~) 2.006s (~) 0.615s 15 1.09x
💻 Local Nitro 1.514s (+1.2%) 2.005s (~) 0.491s 15 1.19x
💻 Local Express 1.532s (+2.2%) 2.006s (~) 0.474s 15 1.21x
💻 Local Next.js (Turbopack) 1.562s (+4.4%) 2.006s (~) 0.445s 15 1.23x
🌐 MongoDB Next.js (Turbopack) 2.150s (-0.7%) 3.010s (~) 0.860s 10 1.69x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Express 2.410s (+6.6% 🔺) 3.796s (+0.8%) 1.387s 9 1.00x
▲ Vercel Next.js (Turbopack) 2.485s (-7.5% 🟢) 3.580s (-10.8% 🟢) 1.095s 9 1.03x
▲ Vercel Nitro 3.009s (+23.3% 🔺) 4.346s (+5.8% 🔺) 1.337s 7 1.25x

🔍 Observability: Express | Next.js (Turbopack) | Nitro

Promise.all with 25 concurrent steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🐘 Postgres 🥇 Nitro 2.440s (-0.7%) 3.013s (~) 0.572s 10 1.00x
🐘 Postgres Express 2.445s (~) 3.012s (~) 0.567s 10 1.00x
🐘 Postgres Next.js (Turbopack) 2.557s (+2.2%) 3.018s (~) 0.461s 10 1.05x
🌐 Redis Next.js (Turbopack) 2.580s (~) 3.007s (~) 0.427s 10 1.06x
💻 Local Nitro 2.938s (+1.9%) 3.342s (+11.1% 🔺) 0.403s 9 1.20x
💻 Local Express 3.025s (+14.0% 🔺) 3.453s (+14.8% 🔺) 0.428s 9 1.24x
💻 Local Next.js (Turbopack) 3.091s (+12.4% 🔺) 3.885s (+25.0% 🔺) 0.794s 8 1.27x
🌐 MongoDB Next.js (Turbopack) 4.744s (+2.3%) 5.179s (~) 0.435s 6 1.94x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Nitro 2.704s (+4.8%) 4.152s (+5.0% 🔺) 1.448s 8 1.00x
▲ Vercel Express 2.730s (+11.7% 🔺) 3.821s (+2.4%) 1.091s 8 1.01x
▲ Vercel Next.js (Turbopack) 3.040s (+3.3%) 4.267s (-5.9% 🟢) 1.227s 8 1.12x

🔍 Observability: Nitro | Express | Next.js (Turbopack)

Promise.all with 50 concurrent steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🐘 Postgres 🥇 Express 3.587s (~) 4.014s (~) 0.427s 8 1.00x
🐘 Postgres Nitro 3.615s (+0.5%) 4.015s (~) 0.401s 8 1.01x
🐘 Postgres Next.js (Turbopack) 3.906s (+3.6%) 4.147s (+3.3%) 0.241s 8 1.09x
🌐 Redis Next.js (Turbopack) 4.085s (-2.4%) 4.725s (-5.7% 🟢) 0.640s 7 1.14x
💻 Local Next.js (Turbopack) 7.824s (+27.2% 🔺) 8.022s (+17.7% 🔺) 0.198s 4 2.18x
💻 Local Express 8.129s (+20.2% 🔺) 8.523s (+13.4% 🔺) 0.394s 4 2.27x
💻 Local Nitro 8.769s (+10.0% 🔺) 9.275s (+5.7% 🔺) 0.505s 4 2.44x
🌐 MongoDB Next.js (Turbopack) 9.831s (~) 10.351s (~) 0.519s 3 2.74x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Nitro 3.139s (-1.8%) 4.486s (-9.2% 🟢) 1.347s 7 1.00x
▲ Vercel Express 3.337s (+13.7% 🔺) 4.981s (+14.8% 🔺) 1.645s 7 1.06x
▲ Vercel Next.js (Turbopack) 3.888s (-3.8%) 5.096s (-2.2%) 1.208s 6 1.24x

🔍 Observability: Nitro | Express | Next.js (Turbopack)

Promise.race with 10 concurrent steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🐘 Postgres 🥇 Express 1.270s (~) 2.011s (~) 0.740s 15 1.00x
🐘 Postgres Nitro 1.274s (+0.6%) 2.011s (~) 0.737s 15 1.00x
🐘 Postgres Next.js (Turbopack) 1.282s (+0.7%) 2.012s (~) 0.730s 15 1.01x
🌐 Redis Next.js (Turbopack) 1.289s (-3.7%) 2.006s (~) 0.718s 15 1.01x
💻 Local Express 1.507s (~) 2.005s (~) 0.498s 15 1.19x
💻 Local Nitro 1.523s (+1.2%) 2.007s (~) 0.483s 15 1.20x
💻 Local Next.js (Turbopack) 1.580s (+3.2%) 2.072s (+3.3%) 0.492s 15 1.24x
🌐 MongoDB Next.js (Turbopack) 2.156s (-1.7%) 3.009s (~) 0.853s 10 1.70x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Next.js (Turbopack) 2.209s (-10.0% 🟢) 3.356s (-11.2% 🟢) 1.147s 9 1.00x
▲ Vercel Nitro 2.265s (+0.7%) 3.871s (+4.4%) 1.606s 8 1.03x
▲ Vercel Express 3.931s (+48.8% 🔺) 5.630s (+38.8% 🔺) 1.699s 6 1.78x

🔍 Observability: Next.js (Turbopack) | Nitro | Express

Promise.race with 25 concurrent steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🐘 Postgres 🥇 Express 2.446s (~) 3.011s (~) 0.564s 10 1.00x
🐘 Postgres Nitro 2.469s (~) 3.011s (~) 0.542s 10 1.01x
🐘 Postgres Next.js (Turbopack) 2.532s (~) 3.014s (-3.1%) 0.481s 10 1.04x
🌐 Redis Next.js (Turbopack) 2.579s (-0.8%) 3.007s (~) 0.428s 10 1.05x
💻 Local Next.js (Turbopack) 2.842s (+3.9%) 3.676s (+22.2% 🔺) 0.834s 9 1.16x
💻 Local Express 3.006s (+5.9% 🔺) 3.564s (+14.6% 🔺) 0.558s 9 1.23x
💻 Local Nitro 3.059s (-1.0%) 3.760s (~) 0.700s 8 1.25x
🌐 MongoDB Next.js (Turbopack) 4.698s (+2.4%) 5.177s (~) 0.479s 6 1.92x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Next.js (Turbopack) 2.715s (-33.2% 🟢) 3.833s (-27.1% 🟢) 1.117s 8 1.00x
▲ Vercel Express 2.796s (+4.1%) 4.021s (-4.6%) 1.225s 8 1.03x
▲ Vercel Nitro 4.002s (~) 5.417s (~) 1.415s 6 1.47x

🔍 Observability: Next.js (Turbopack) | Express | Nitro

Promise.race with 50 concurrent steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🐘 Postgres 🥇 Nitro 3.583s (-0.5%) 4.014s (~) 0.431s 8 1.00x
🐘 Postgres Express 3.588s (-0.5%) 4.016s (~) 0.428s 8 1.00x
🐘 Postgres Next.js (Turbopack) 3.791s (-1.8%) 4.015s (-3.0%) 0.224s 8 1.06x
🌐 Redis Next.js (Turbopack) 4.182s (+1.3%) 5.011s (~) 0.829s 6 1.17x
💻 Local Next.js (Turbopack) 8.369s (+22.4% 🔺) 8.771s (+16.7% 🔺) 0.403s 4 2.34x
💻 Local Express 8.506s (+8.1% 🔺) 9.025s (+9.1% 🔺) 0.519s 4 2.37x
💻 Local Nitro 9.224s (+4.6%) 9.774s (+2.6%) 0.550s 4 2.57x
🌐 MongoDB Next.js (Turbopack) 10.023s (+0.5%) 10.349s (~) 0.326s 3 2.80x

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Nitro 2.732s (-5.8% 🟢) 4.067s (-8.4% 🟢) 1.334s 8 1.00x
▲ Vercel Express 2.808s (-8.7% 🟢) 3.854s (-18.6% 🟢) 1.046s 8 1.03x
▲ Vercel Next.js (Turbopack) 3.334s (-6.8% 🟢) 4.637s (-9.1% 🟢) 1.303s 7 1.22x

🔍 Observability: Nitro | Express | Next.js (Turbopack)

Stream Benchmarks (includes TTFB metrics)
workflow with stream

💻 Local Development

World Framework Workflow Time TTFB Slurp Wall Time Overhead Samples vs Fastest
💻 Local 🥇 Next.js (Turbopack) 0.171s (+20.7% 🔺) 1.002s (~) 0.012s (+35.6% 🔺) 1.018s (~) 0.848s 10 1.00x
🌐 Redis Next.js (Turbopack) 0.172s (-3.4%) 1.000s (~) 0.002s (+15.4% 🔺) 1.008s (~) 0.835s 10 1.01x
💻 Local Express 0.199s (+35.9% 🔺) 1.003s (~) 0.011s (-1.8%) 1.017s (~) 0.817s 10 1.17x
💻 Local Nitro 0.200s (-1.3%) 1.003s (~) 0.012s (~) 1.018s (~) 0.818s 10 1.17x
🐘 Postgres Nitro 0.220s (+4.2%) 0.996s (~) 0.002s (~) 1.012s (~) 0.793s 10 1.29x
🐘 Postgres Express 0.221s (-1.3%) 0.992s (~) 0.002s (+21.4% 🔺) 1.013s (~) 0.792s 10 1.29x
🐘 Postgres Next.js (Turbopack) 0.241s (+11.2% 🔺) 1.003s (~) 0.002s (+7.1% 🔺) 1.019s (+0.5%) 0.778s 10 1.41x
🌐 MongoDB Next.js (Turbopack) 0.510s (+5.7% 🔺) 0.937s (-3.1%) 0.001s (~) 1.009s (~) 0.499s 10 2.99x

▲ Production (Vercel)

World Framework Workflow Time TTFB Slurp Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Nitro 1.662s (-1.5%) 2.422s (-4.6%) 0.005s (-34.2% 🟢) 15.088s (+377.3% 🔺) 13.426s 10 1.00x
▲ Vercel Express 1.814s (+2.3%) 2.919s (+2.6%) 0.006s (+42.9% 🔺) 15.475s (+340.0% 🔺) 13.661s 10 1.09x
▲ Vercel Next.js (Turbopack) 1.837s (+14.7% 🔺) 2.956s (+5.4% 🔺) 0.005s (+20.0% 🔺) 3.522s (+5.1% 🔺) 1.685s 10 1.11x

🔍 Observability: Nitro | Express | Next.js (Turbopack)

Summary

Fastest Framework by World

Winner determined by most benchmark wins

| World | 🥇 Fastest Framework | Wins |
| --- | --- | --- |
| 💻 Local | Next.js (Turbopack) | 8/12 |
| 🐘 Postgres | Express | 6/12 |
| ▲ Vercel | Nitro | 5/12 |

Fastest World by Framework

Winner determined by most benchmark wins

| Framework | 🥇 Fastest World | Wins |
| --- | --- | --- |
| Express | 🐘 Postgres | 7/12 |
| Next.js (Turbopack) | 💻 Local | 4/12 |
| Nitro | 🐘 Postgres | 6/12 |

Column Definitions
  • Workflow Time: Runtime reported by workflow (completedAt - createdAt) - primary metric
  • TTFB: Time to First Byte - time from workflow start until first stream byte received (stream benchmarks only)
  • Slurp: Time from first byte to complete stream consumption (stream benchmarks only)
  • Wall Time: Total testbench time (trigger workflow + poll for result)
  • Overhead: Testbench overhead (Wall Time - Workflow Time)
  • Samples: Number of benchmark iterations run
  • vs Fastest: How much slower compared to the fastest configuration for this benchmark
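
As a worked check of the Overhead definition, the sketch below recomputes one table cell. The helper name is hypothetical, and the benchmark tooling presumably works from unrounded measurements, so only the subtraction itself is illustrated.

```typescript
// Overhead = Wall Time - Workflow Time, formatted to milliseconds.
function overheadSeconds(wallTime: number, workflowTime: number): string {
  return (wallTime - workflowTime).toFixed(3);
}

// First row of the local "workflow with no steps" table:
// Wall Time 1.005s, Workflow Time 0.043s.
const expressOverhead = overheadSeconds(1.005, 0.043); // "0.962"
```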

Worlds:

  • 💻 Local: In-memory filesystem world (local development)
  • 🐘 Postgres: PostgreSQL database world (local development)
  • ▲ Vercel: Vercel production/preview deployment
  • 🌐 Turso: Community world (local development)
  • 🌐 MongoDB: Community world (local development)
  • 🌐 Redis: Community world (local development)
  • 🌐 Jazz: Community world (local development)

📋 View full workflow run

Copilot AI left a comment

Pull request overview

Adds run failure classification via standardized error codes and threads those codes through runtime events into the UI, while also tightening up debug/error logging in world implementations.

Changes:

  • Introduces RUN_ERROR_CODES (USER_ERROR, RUNTIME_ERROR) and classifyRunError() to populate run_failed.eventData.errorCode.
  • Updates Web UI status badge to surface/copy error codes (when present) instead of raw error details.
  • Adjusts world-local queue error logging and world-vercel schema-validation error messaging / debug gating.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
packages/world-vercel/src/utils.ts Tweaks DEBUG behavior and improves schema validation error formatting (with optional debug context).
packages/world-local/src/queue.ts Simplifies queue failure logs and standardizes log prefixes/metadata.
packages/web/app/components/display-utils/status-badge.tsx Shows a tooltip for failed statuses that exposes/copies StructuredError.code.
packages/errors/src/index.ts Re-exports new error code constants/types from error-codes.
packages/errors/src/error-codes.ts Defines canonical run error codes and exported union type.
packages/core/src/runtime.ts Classifies caught workflow errors and includes errorCode in run_failed events.
packages/core/src/classify-error.ts Adds helper to classify errors into run error codes.
packages/core/src/classify-error.test.ts Unit tests for classifyRunError.
packages/core/e2e/e2e.test.ts Extends e2e assertions to verify error codes are present in surfaced errors and CLI output.
.changeset/classify-run-error-codes.md Publishes patch bumps and documents the feature/logging changes.

Comment on lines 360 to 386
@@ -374,7 +381,7 @@ export function workflowEntrypoint(
 message: errorMessage,
 stack: errorStack,
 },
-// TODO: include error codes when we define them
+errorCode,
 },
 },
Collaborator Author

Fixed — the setup failure path now includes errorCode: RUN_ERROR_CODES.RUNTIME_ERROR in the run_failed event data. Both run_failed emission sites are now consistent.

@VaguelySerious (Member) left a comment

First pass by Claude:

1. DEBUG truthiness change leaks Authorization Bearer tokens in logs

The DEBUG check changed from process.env.DEBUG === '1' to process.env.DEBUG (truthy). The curl-reproduction debug block at line 302 stringifies all request headers, including Authorization: Bearer <token>. The debug npm package (a dependency of @workflow/core) commonly sets DEBUG=workflow:* or DEBUG=*, which is now truthy and triggers token emission to stderr. This flows into Vercel deployment logs, Datadog log drains, and CI output.

Suggestion: Either (a) revert to process.env.DEBUG === '1' for the curl block specifically, or (b) redact the Authorization header before logging. The schema-validation debug context at line 369 is less sensitive and can use truthy.

// Option (a): revert strict check for curl block
if (process.env.DEBUG === '1') {

// Option (b): redact sensitive headers
const safeHeaders = Array.from(headers.entries())
  .filter(([key]) => key.toLowerCase() !== 'authorization')
  .map(([key, value]) => `-H "${key}: ${value}"`)
  .join(' ');
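
A slightly fuller sketch of option (b), for illustration only. It masks the value rather than dropping the header, so the reproduced curl command still shows that an auth header was sent. The helper name `curlHeaderFlags` and the `SENSITIVE_HEADERS` list are assumptions, not names from this PR.

```typescript
// Hypothetical helper: build the `-H` flags for a curl reproduction,
// masking the values of headers that typically carry credentials.
const SENSITIVE_HEADERS = new Set(['authorization', 'cookie', 'proxy-authorization']);

function curlHeaderFlags(headers: Iterable<[string, string]>): string {
  return Array.from(headers)
    .map(([key, value]) => {
      // Header names are case-insensitive, so compare in lower case.
      const shown = SENSITIVE_HEADERS.has(key.toLowerCase()) ? '[REDACTED]' : value;
      return `-H "${key}: ${shown}"`;
    })
    .join(' ');
}
```

Because the parameter is any iterable of key/value pairs, a `Headers` instance or a `Map` could be passed directly.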

2. UI regression: StatusBadge tooltip lost for runs without error code

The old ErrorStatusBadge showed a tooltip with the full error message whenever status === 'failed' && context?.error. The new ErrorCodeBadge only renders when getErrorCode(context?.error) returns a string. This means:

  • Historical failed runs pre-dating this PR have no .code on their StructuredError -- tooltip silently disappears
  • Step failures never populate errorCode (only run_failed events do) -- no tooltip for failed steps
  • The error message itself is no longer shown anywhere in the tooltip -- only the code string like USER_ERROR

Suggestion: Keep ErrorCodeBadge for runs with a code, but restore a fallback that shows the error message when no code is present. Consider a two-tier tooltip: error code (if present) + error message.


3. errorCode accepts arbitrary strings in wire schemas

Both WorkflowRunWireBaseSchema and RunFailedEventSchema define errorCode: z.string().optional(). The domain defines only two valid codes (USER_ERROR, RUNTIME_ERROR), but validation doesn't enforce this at the boundary.

Suggestion: Use z.enum(['USER_ERROR', 'RUNTIME_ERROR']).optional() on the write path (event creation). On the read path (wire schema), z.string().optional() is acceptable for forward-compatibility with newer server versions.
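
The strict-write / lenient-read split can be sketched without any schema library. This is only an illustration of the idea; the function names are hypothetical, and the two codes are the ones listed in the PR description.

```typescript
// Assumed from the PR description: the two codes defined today.
const KNOWN_RUN_ERROR_CODES = ['USER_ERROR', 'RUNTIME_ERROR'] as const;
type RunErrorCode = (typeof KNOWN_RUN_ERROR_CODES)[number];

// Write path: reject anything that is not a known code.
function assertRunErrorCode(value: string): RunErrorCode {
  if (KNOWN_RUN_ERROR_CODES.some((code) => code === value)) {
    return value as RunErrorCode;
  }
  throw new TypeError(`Unknown run error code: ${value}`);
}

// Read path: accept any string (or nothing at all), so a code added by a
// newer server version still parses on an older client.
function readErrorCode(raw: unknown): string | undefined {
  return typeof raw === 'string' ? raw : undefined;
}
```

The trade-off is that readers must treat the code as an open set and fall back to a default presentation for values they do not recognize.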


4. Misleading retry denominator in queue log message

attempt ${attempt + 1}/${attempt + 1 + defaultRetriesLeft} works by arithmetic coincidence because defaultRetriesLeft is decremented before this line. If retry logic changes (e.g., defaultRetriesLeft is incremented on timeout at line 165), the denominator inflates.

Suggestion: Capture total before the loop:

const maxAttempts = defaultRetriesLeft + 1;
// then in the loop:
`attempt ${attempt + 1}/${maxAttempts}`
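
As a self-contained illustration of why a fixed denominator is more robust, here is a minimal retry loop. It is a simplification and does not model the real queue's timeout handling; `runWithRetries` and its parameters are hypothetical names.

```typescript
// Hypothetical retry wrapper: the attempt budget is computed once, before
// the loop, so later changes to the retry bookkeeping cannot skew the log.
async function runWithRetries<T>(
  task: (attempt: number) => Promise<T>,
  retries: number,
  log: (line: string) => void,
): Promise<T> {
  const maxAttempts = retries + 1;
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await task(attempt);
    } catch (err) {
      lastError = err;
      log(`attempt ${attempt + 1}/${maxAttempts} failed: ${String(err)}`);
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```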

5. Stale TODO in StructuredError.code definition

The comment says "// TODO: currently unused. make this an enum maybe", but this PR actively populates the field.

Suggestion: Update to reflect current state: // Populated with RunErrorCode values (USER_ERROR, RUNTIME_ERROR) for run_failed events


6. Missing OTEL span attribute for errorCode

The errorCode is logged to runtimeLogger but NOT set as an OTEL span attribute. Span attributes include WorkflowRunStatus, WorkflowErrorName, WorkflowErrorMessage, and ErrorType but omit error classification. Datadog traces cannot filter by USER_ERROR vs RUNTIME_ERROR.

Suggestion: Add Attribute.WorkflowErrorCode(errorCode) to both span attribute blocks (lines 406-409 and 417-421).

Collaborator Author

@pranaygp left a comment

Addressed all actionable items:

  1. Auth header leak — Fixed. Authorization header is now filtered out of the curl debug output.

  2. UI tooltip regression — Intentional. The old error message tooltip won't work with encryption (error is behind refs), so we intentionally show only the error code. No tooltip for runs without a code is the expected behavior for backward compat.

  3. z.string() vs z.enum — Keeping z.string(). The domain types enforce valid codes at the TypeScript level. Using z.enum on the wire would break forward-compat when new error codes are added.

  4. Misleading retry denominator — Fixed. Captured maxAttempts before the loop.

  5. Stale TODO — Fixed. Updated comment to reflect current state.

  6. Missing OTEL span attribute — Fixed. Added Attribute.WorkflowErrorCode(errorCode) to both span attribute blocks.

Member

@VaguelySerious left a comment

LGTM, left a few pieces of human feedback.

Also, Claude says steps can fail but have no classification: a step failure that bubbles up to "run_failed" gets a code only at the run level. Document this as an intentional Phase 1 scope limitation if step-level classification is planned.

 const handleCopy = async (e: React.MouseEvent) => {
   e.stopPropagation();
-  await navigator.clipboard.writeText(errorMessage);
+  await navigator.clipboard.writeText(errorCode);
Member

The Clipboard API throws on unfocused pages, non-HTTPS contexts, or permission denial. The existing copyable-text.tsx:28-34 correctly wraps this in try/catch - this component does not. Could fix.
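
A minimal sketch of the guarded copy, assuming nothing about how copyable-text.tsx is actually written. The writer is injected so the helper can be exercised outside a browser; `copyText` and `ClipboardWriter` are hypothetical names.

```typescript
type ClipboardWriter = (text: string) => Promise<void>;

// Returns true when the write succeeded and false when it was rejected
// (unfocused page, insecure context, denied permission, and so on).
async function copyText(text: string, write: ClipboardWriter): Promise<boolean> {
  try {
    await write(text);
    return true;
  } catch {
    // Swallow the rejection so a failed copy never breaks the click handler.
    return false;
  }
}
```

In a component, the writer would typically be `(t) => navigator.clipboard.writeText(t)`, and the boolean result could decide whether to show a "copied" indicator.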

Collaborator Author

Fixed — wrapped in try/catch.

@@ -90,12 +90,16 @@ export function serializeError<T extends { error?: StructuredError }>(
* status), but the transformation preserves all other fields correctly.
*/
export function deserializeError<T extends Record<string, any>>(obj: any): T {
Member

Changes here are missing unit tests but fine IMO

Collaborator Author

Ack — the errorCode merging in deserializeError is covered implicitly by the e2e tests that assert error.cause.code === 'USER_ERROR' end-to-end.

expect(WorkflowRunFailedError.is(error)).toBe(true);
assert(WorkflowRunFailedError.is(error));
expect(error.cause.message).toContain('Nested workflow error');
expect(error.cause.code).toBe('USER_ERROR');
Member

We should add one e2e test that actually causes + asserts a RUNTIME_ERROR

Collaborator Author

Triggering RUNTIME_ERROR in e2e requires corrupting the event log (e.g., injecting an unexpected event type for a step). This is fragile and environment-dependent. We have unit test coverage for WorkflowRuntimeError classification in classify-error.test.ts and for the error itself in step.test.ts. Could add an e2e in a follow-up with a dedicated fault injection mechanism.

Collaborator Author

human: I tried this, but it's not possible without chaos testing or a bigger change that's out of scope for this PR. tl;dr: we can't easily inject failures into runs (we do on steps, by injecting 500s into the world). For workflow/run code we would either need to expose something to the VM so it can inject the failure (i.e. changing runtime code just for a fault injection test), or put a proxy in front of workflow-server and the queue to inject these failures.

at that point I'm thinking we just do this in the chaos testing @TooTallNate is setting up and we can have validation that it's working once that's up

* These are populated in the `errorCode` field of `run_failed` events
* and flow through to `StructuredError.code` on the run entity.
*/
export const RUN_ERROR_CODES = {
Member

If these are also user-facing, should add to docs

Collaborator Author

Good call — will add docs in a follow-up. These are user-facing via WorkflowRunFailedError.cause.code.

Collaborator Author

Added docs in #1445 — documents error codes in the errors & retries guide.
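
For readers of this thread, the classification rule described in the PR summary can be sketched as follows. This is not the contents of error-codes.ts or classify-error.ts, which are not shown here. The `WorkflowRuntimeError` class below is a stand-in, and the use of `instanceof` is an assumption.

```typescript
// Values as named in this PR; the exact shape of the real export is assumed.
const RUN_ERROR_CODES = {
  USER_ERROR: 'USER_ERROR',
  RUNTIME_ERROR: 'RUNTIME_ERROR',
} as const;
type RunErrorCode = (typeof RUN_ERROR_CODES)[keyof typeof RUN_ERROR_CODES];

// Stand-in for the runtime's internal error class.
class WorkflowRuntimeError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'WorkflowRuntimeError';
    // Keeps instanceof working even when compiled to an ES5 target.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Internal runtime bugs map to RUNTIME_ERROR; anything else thrown from
// workflow code (including non-Error values) is treated as USER_ERROR.
function classifyRunError(err: unknown): RunErrorCode {
  return err instanceof WorkflowRuntimeError
    ? RUN_ERROR_CODES.RUNTIME_ERROR
    : RUN_ERROR_CODES.USER_ERROR;
}
```

Defaulting to USER_ERROR matches the summary's statement that infrastructure errors never reach this path, so whatever is caught is either user code or an internal runtime error.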

- Add RUN_ERROR_CODES (USER_ERROR, RUNTIME_ERROR) to @workflow/errors
- Populate errorCode in run_failed events via classifyRunError()
- Update web UI StatusBadge to show amber dot for internal runtime errors (RUNTIME_ERROR)
- Improve world-local queue error logging (concise, no body dump)
- Improve schema validation error messages (concise, verbose behind DEBUG)
- Add e2e tests for error code flow and infrastructure error retry

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@pranaygp
Collaborator Author

pranaygp commented Mar 18, 2026

human: Merging this so I can unblock the next PR. good call on the docs though - I've started another PR for that rather than this one so I don't have to wait for CI again for a docs change

Collaborator Author

@pranaygp left a comment

Docs PR for error codes: #1445

3 participants