[SPARK-57167][SQL] Simplify MakeTimestamp codegen under ANSI mode#56217
Open
gengliangwang wants to merge 2 commits into
Open
[SPARK-57167][SQL] Simplify MakeTimestamp codegen under ANSI mode#56217gengliangwang wants to merge 2 commits into
gengliangwang wants to merge 2 commits into
Conversation
### What changes were proposed in this pull request? Extend `DateTimeExpressionUtils.java` with two static helpers and route `MakeTimestamp`'s eval and codegen paths through them: * `makeTimestampMicros(year, month, day, hour, min, Decimal secAndMicros, ZoneId zoneId, boolean timestampNTZ)`: the shared, exception-raising core. It computes the micros (leap-second `sec = 60` rollover for PostgreSQL compatibility included), throwing `SparkDateTimeException` for an invalid fraction of second and `DateTimeException` for an invalid year/month/day/hour/min. * `makeTimestampExact(...)`: the ANSI (`failOnError = true`) wrapper. It rethrows `SparkDateTimeException` as-is (preserving its message) and translates any other `DateTimeException` to `ansiDateTimeArgumentOutOfRange`. `MakeTimestamp.toMicros` (eval) and `doGenCode` now dispatch on `failOnError`: the ANSI path emits a single `makeTimestampExact(...)` call; the non-ANSI path calls `makeTimestampMicros(...)` inside the existing inline `try/catch -> isNull`. ### Why are the changes needed? Part of SPARK-56908 (umbrella). The ANSI branch of `MakeTimestamp.doGenCode` previously emitted a ~22-line inline block (Decimal floor/nanos math, `LocalDateTime.of`, leap-second handling, the timezone/NTZ conversion, and a two-arm `try/catch` mapping to the ANSI errors) into every generated stage that calls `make_timestamp`. Collapsing it to one helper call shrinks the generated Java source. As a side benefit, eval and codegen now share one implementation, removing a latent divergence (codegen previously derived seconds/nanos via `Decimal.floor` while eval used `toUnscaledLong`). ### Does this PR introduce _any_ user-facing change? No. The compiled behavior is identical; only the emitted Java source text changes. ### How was this patch tested? ``` build/sbt "catalyst/testOnly *DateExpressionsSuite" ``` 75/75 pass (covers make_timestamp / make_timestamp_ntz / make_timestamp_ltz, exercised both with and without whole-stage codegen). ### Was this patch authored or co-authored using generative AI tooling? Generated-by: Claude Code (Opus 4.8) Co-authored-by: Isaac
…r limit) Co-authored-by: Isaac
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What changes were proposed in this pull request?
Extend
DateTimeExpressionUtils.javawith two static helpers and routeMakeTimestamp's eval and codegen paths through them:makeTimestampMicros(int year, int month, int day, int hour, int min, Decimal secAndMicros, ZoneId zoneId, boolean timestampNTZ): the shared, exception-raising core. It computes the micros (including the leap-secondsec = 60rollover supported for PostgreSQL compatibility), throwingSparkDateTimeExceptionfor an invalid fraction of second andDateTimeExceptionfor an invalid year/month/day/hour/min combination.makeTimestampExact(...): the ANSI (failOnError = true) wrapper. It rethrowsSparkDateTimeExceptionas-is (to preserve its message) and translates any otherDateTimeExceptiontoansiDateTimeArgumentOutOfRange.SparkDateTimeExceptionis caught first because it is itself aDateTimeException.MakeTimestamp.toMicros(eval) anddoGenCodenow dispatch onfailOnError: the ANSI path emits a singlemakeTimestampExact(...)call, while the non-ANSI path callsmakeTimestampMicros(...)inside the existing inlinetry/catch -> isNullform (matching the pattern used by the already-mergedMakeDate/MakeIntervalcleanups).Why are the changes needed?
Part of SPARK-56908 (umbrella). The ANSI branch of
MakeTimestamp.doGenCodepreviously emitted a ~22-line inline block (Decimal floor/nanos math,LocalDateTime.of, leap-second handling, the timezone/NTZ conversion, and a two-armtry/catchmapping to the ANSI errors) into every generated stage that callsmake_timestamp. Collapsing it to a single helper call shrinks the generated Java source, helping with the JVM 64KB method / constant-pool limits, Janino compile time, and JIT work. As a side benefit, eval and codegen now share one implementation, removing a latent divergence (codegen previously derived seconds/nanos viaDecimal.floorwhile eval usedtoUnscaledLong).Does this PR introduce any user-facing change?
No. The compiled behavior is identical; only the emitted Java source text changes.
How was this patch tested?
75/75 pass (covers make_timestamp / make_timestamp_ntz / make_timestamp_ltz, exercised both with and without whole-stage codegen).
Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Opus 4.8)