fix: ship handler.mjs as the sole Lambda entry point#784
Conversation
AWS Lambda's CJS resolver picks `dist/handler.js` before `dist/handler.mjs` when the handler string is `node_modules/datadog-lambda-js/dist/handler.handler`, causing ESM user functions to fail with ERR_REQUIRE_ESM (or the surface "Cannot find module" variant when _tryRequireSync swallows it). Today `dist/handler.js` ships as a verbatim copy of `dist/handler.cjs` (via `cp ./dist/handler.cjs ./dist/handler.js` in the publish scripts), so it can never load ESM user code. This replaces the copy with a real `src/handler.js` shim that: - Detects ESM user code via the task root `package.json` `type` field or a `.mjs` `DD_LAMBDA_HANDLER` / `_HANDLER`. - For CJS user code: synchronously delegates to `handler.cjs`, preserving the prior fast path (no cold-start regression). - For ESM user code: exposes an async handler that lazily `import()`s `handler.mjs` on the first invocation. `handler.mjs`'s async `load()` transparently handles both CJS and ESM user modules. `scripts/update_dist_version.sh` already copies `src/handler.*` into `dist/`, so the shim ships automatically; the explicit `cp ./dist/handler.cjs ./dist/handler.js` lines in `publish_npm.sh` and `publish_prod.sh` are no longer needed and are removed. Refs: #782 Refs: SLES-2888, SLES-2895, SVLS-9238
|
The Dockerfile copies `dist/` wholesale into the layer, so the new `dist/handler.js` shim would get included and Lambda's bootstrap would resolve `handler.handler` to it. Inside the layer that goes through the shim's CJS branch -> handler.cjs -> _tryRequireSync, which can't load `.mjs` user modules (this regressed the `esm_node*` integration tests). The layer has never shipped a `handler.js`; Lambda's resolver falls through to `handler.mjs`, whose async `load()` handles both CJS and ESM user code. Drop the shim from the layer to preserve that behavior. The shim is still useful in the npm tarball, where the publish scripts used to copy `handler.cjs` to `handler.js`.
|
First push regressed the Inside the layer, the shim's The layer has never shipped a |
duncanista
left a comment
There was a problem hiding this comment.
Thanks for tackling this — the bug is real and the runtime-branching shim is a reasonable approach. A few thoughts on the structure and on coverage that I think are worth addressing before merge. Nothing blocking the direction, just some cleanup that would make the file layout easier to reason about for the next person who touches it.
- Rename `userIsEsm()` to `shouldUseEsmHandler()`. The function returns a
routing decision based on heuristics (env regex + package.json sniff),
not an authoritative fact about the user's code, so the new name reads
more honestly at the call site.
- Inline the former `src/handler.cjs` body into the CJS branch of the
shim and delete `src/handler.cjs`. Lambda's resolver doesn't auto-resolve
`.cjs`, so handler.cjs was only ever reachable via this shim's
`require("./handler.cjs")`. Keeping it as a separate file made the load
graph confusing — future readers grepping for "what loads handler.cjs?"
would find one caller in the same directory. The heavy `require()`s
remain scoped to the CJS branch so ESM users don't pay the cost of
pulling in the full tracer at require-time.
- Add `src/handler.spec.ts` covering shouldUseEsmHandler's four-case
matrix (.mjs in _HANDLER / DD_LAMBDA_HANDLER, package.json type=module,
package.json without type field, package.json missing, package.json
unparseable). Detection is the load-bearing piece — a regression there
means either ESM users keep seeing ERR_REQUIRE_ESM or CJS users pay an
unnecessary async-import cold-start cost.
- Tighten `scripts/update_dist_version.sh`'s `cp src/handler.*` to list
the runtime artifacts explicitly. The broader glob also picked up
`src/handler.spec.ts`, which is a test fixture, not a runtime file —
shipping it would also be silently inconsistent with the Dockerfile's
hard-coded `rm /nodejs/.../handler.js`.
- Expand the Dockerfile comment to point at
`scripts/update_dist_version.sh` as the source of the file being
removed, so a future change to how `src/handler.js` is copied into
`dist/` is visibly tied to the layer-side `rm`.
|
I think I wrote the original version of the handler.cjs/esm module split a few years ago, so maybe this context would help. So, much of this logic was due to whether the node runtime supports ESM modules and top level await, (at the time we supported node 12 which didn't). Top level await is only supported in ESM, and is needed when dynamic importing ESM modules so we can receive the handler function instead of a promise, and export the result. This isn't possible in CJS, while you can import ESM modules dynamically, they end up as promises and if you want to await that promise you have to do it within an async function. ESM doesn't have any problems requiring CJS modules dynamically though. So, when the customer is using the layer, and a runtime >= 14, it makes sense to just use the ESM handler since it should handle all cases in theory. Since older runtimes are completely deprecated now, we should only bundle the ESM handler into the layer. I haven't looked at the support case to check if this reasoning was solid, just what I remember verifying at the time, and I know that was the previous attempted approach, but I'm confused how it could have broken CJS users. The second case is when installing datadog-lambda-js via npm. Since we don't know which module format the customer is using, we have to package both versions. I believe I did some experimentation to find which files would be auto-imported based on their file extension. I think you had to specify the .mjs extension for it to work for esm. Typically though, a customer using the npm module could just import the datadog-lambda-js library and wrap manually if they ran into an issue. |
After feedback from Darcy Rayner (original handler.cjs/mjs author) and
re-evaluating the cold-start trade-offs, the shim is the wrong shape.
The shim's CJS branch saved ~10-30ms of cold-start init for CJS users
on the npm-redirect path, but its ESM branch moved 300-800ms of
tracer/handler load work from Lambda's init phase into the first
invocation. That's worse in three ways for ESM users (the cohort this
PR exists to fix):
1. First-invoke timeout risk on functions with tight timeouts —
the loading work now eats into the invocation's time budget.
2. Provisioned concurrency is wasted — work deferred past init isn't
pre-warmed, so PC customers pay for warm capacity and still hit
cold-start latency on the first real request.
3. Lambda SnapStart is wasted for the same reason — snapshot is
taken after init, and the shim's lazy `import()` runs post-snapshot.
handler.mjs's async `load()` already handles both CJS and ESM user
modules transparently. Lambda's bootstrap resolves `dist/handler.handler`
to handler.mjs once `.js` is absent, so removing the shim is sufficient
and matches the behavior the published layer has always had.
End state:
- Sole Lambda entry point everywhere: `handler.mjs`.
- The npm tarball and the layer ship the same `dist/` (one handler file).
- No detection heuristics, no shim, no `handler.cjs` (already gone).
Aligns with:
- Darcy Rayner's recommendation on the PR thread to "only bundle the
ESM handler" since all supported runtimes (Node 18+) support top-level
await in ESM.
- The customer's own suggested fix in
#782.
- SLES-2888's confirmed-working internal workaround
(`rm dist/handler.js` post-install).
- The intent of the earlier PR #697.
|
@DarcyRaynerDD thanks for the historical context — it's exactly what was missing. You're right on both counts. I switched the PR to do what you suggested in commit Re your "I'm confused how it could have broken CJS users" point — I went back through this carefully and I don't think it would have. Lambda's bootstrap resolves I had originally added a runtime-branching shim to avoid making CJS users on the npm-redirect path pay an extra ~10-30ms of init time (sync Description updated. Welcome any further notes. |
There was a problem hiding this comment.
Pull request overview
Removes the CommonJS handler artifact path so AWS Lambda resolves the Datadog handler redirect to the ESM handler.mjs, avoiding ESM user-handler load failures caused by dist/handler.js being preferred by the bootstrap resolver.
Changes:
- Delete
src/handler.cjsso there’s no longer a CJS handler source to ship/copy. - Update
scripts/update_dist_version.shto copy onlysrc/handler.mjsintodist/. - Remove publishing-time steps that copied
dist/handler.cjstodist/handler.js.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
src/handler.cjs |
Removes the CJS handler implementation so it can’t be shipped as an entry point. |
scripts/update_dist_version.sh |
Restricts post-build copying to handler.mjs as the intended Lambda entry point. |
scripts/publish_prod.sh |
Stops injecting dist/handler.js during prod publishing. |
.gitlab/scripts/publish_npm.sh |
Stops injecting dist/handler.js during CI npm publishing. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
According to 🤖
Deleting
It's also not reachable through the package's public entry point: Since this PR is what orphaned it, it'd be clean to remove the three functions ( |
…rtifacts Two follow-ups from PR #784 review: 1. Remove `loadSync`, `_loadUserAppSync`, `_tryRequireSync`. Deleting `src/handler.cjs` in the previous commit removed the only consumers of the synchronous loader chain. `loadSync` was never re-exported from the public `src/index.ts` entry point, no tests reference it, and the only internal import (`src/handler.mjs`) uses the async `load()`. The three functions and the `loadSync` re-export in `src/runtime/index.ts` are now dead code, so drop them. 2. Defensive cleanup in `scripts/update_dist_version.sh`: explicitly `rm -f dist/handler.js dist/handler.cjs` before copying handler.mjs. `tsc` does not clean stale files in `dist/` between incremental builds, so a leftover `handler.js` from a prior build of an earlier ref could resurface and silently re-introduce the resolver bug this PR is fixing. The publish scripts already do `rm -rf ./dist` before `yarn build`, so this is purely a guard for local dev builds.
|
@lucaspimentel good catch — verified and addressed in
Removed all three functions from Net diff: -105 lines from |
What does this PR do?
Fixes ESM Lambda failures with the Datadog handler redirect by removing the
dist/handler.jsartifact from the published npm tarball. Lambda's bootstrap then resolvesnode_modules/datadog-lambda-js/dist/handler.handlertodist/handler.mjs, whose asyncload()already handles both CJS and ESM user modules. This aligns the npm package with the layer's behavior (which has always loadedhandler.mjsdirectly).Motivation
AWS Lambda's CJS resolver picks
dist/handler.jsbeforedist/handler.mjswhen the handler string isnode_modules/datadog-lambda-js/dist/handler.handler. Todaydist/handler.jsships as a verbatim copy ofdist/handler.cjs, so ESM user functions hitERR_REQUIRE_ESM(or the surface "Cannot find module" variant when_tryRequireSyncswallows it).Reported in #782.
Approach
handler.mjsalready handles both CJS and ESM user modules through its asyncload(). The only thing standing in its way isdist/handler.js, which Lambda's resolver picks first. This PR removes the source of that artifact end-to-end:src/handler.cjs(it was only reachable via the shim'srequire("./handler.cjs"), and Lambda's resolver doesn't auto-resolve.cjs).scripts/update_dist_version.shto copy onlysrc/handler.mjsintodist/.RUN rm /nodejs/.../handler.jsin the Dockerfile — there is nohandler.jsto remove now.cp ./dist/handler.cjs ./dist/handler.jslines from.gitlab/scripts/publish_npm.shandscripts/publish_prod.sh(these were the original injection points).End state:
handler.mjsis the sole Lambda entry point in both the npm tarball and the published layer.Why not the runtime-branching shim (earlier commits in this PR)
Initial commits added a
src/handler.jsCJS shim that detected ESM vs CJS user code at runtime and routed accordingly. After reviewer feedback and a closer look at the cold-start path, the shim is the wrong shape:handler.mjs's asyncload(). Negligible against the 500ms-3s a Datadog-instrumented Lambda already spends in init.import()runs post-snapshot.handler.mjsruns all that work under Lambda's init top-levelawait, so PC / SnapStart customers pre-pay the cost and get fast invocations.Aligns with:
handler.cjs/handler.mjsauthor recommending we "only bundle the ESM handler" since all supported runtimes (Node 18+) support top-level await in ESM.Testing Guidelines
yarn buildproduces adist/withhandler.mjsonly — nohandler.js, nohandler.cjs.yarn check-formatting,yarn lint, andyarn testbehave the same asmain(two pre-existing flakes insrc/index.spec.tsaroundnock.isDone()reproduce on plainorigin/mainand are unrelated).esm_node{18,20,22,24}) verifies the layer path with ESM user code. The same path now also covers the npm tarball, since both ship the same handler.Types of Changes
Check all that apply