-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Date / time / interval arithmetic improvements #19460
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| } | ||
| } | ||
|
|
||
| fn temporal_math_coercion( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the new type coercion logic for date time math.
| Ok((left_expr, right_expr)) | ||
| } | ||
|
|
||
| fn coerce_date_time_math_op( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is where the new type coercion for date time math was implemented. Some of the coercion's require multiple casts to generate the desired output
| query D | ||
| SELECT '2001-09-28'::date + interval '1 hour' | ||
| ---- | ||
| 2001-09-28 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this print the time part too ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should actually, the result should actually be timestamp. I'll look into that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an interesting one. Arrow does Date + interval => date (types.rs).
This works as expected for date + int (in pg and for datafusion the int is days) as I transform that to date + interval (as arrow doesn't support date + int) however it's not what is really expected here.
I'll see if I can cast the date + interval to timestamp + interval which should result in the expected output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm hesitant to change this after review. There are existing test cases in timestamps.slt that cover this as well so it's known existing behavior. I'll update the test file to note the difference with postgresql.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, both MySQL and DuckDB (the ones I could check quickly) seem to behave the same as PostgreSQL. The current DataFusion behaviour seems rather unexpected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pepijnve I agree to be honest but I'm a bit hesitant to attempt to fix it in this PR. I'll take another look at it tonight/tomorrow but I'm leaning to having that one be a separate issue and PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed that it probably warrants a separate issue and discussion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with a separate discussion
datafusion/sqllogictest/test_files/datetime/arith_timestamp_duration.slt
Outdated
Show resolved
Hide resolved
…e, timestamp + duration, time - time, time + interval, etc.
…ration.slt Co-authored-by: Martin Grigorov <[email protected]>
Co-authored-by: Martin Grigorov <[email protected]>
4000c0f to
872ae7e
Compare
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @Omega359 @martin-g and @pepijnve -- this is looking really nice and I think the only thing we should fix prior to merge is to add some coverage for other time units. Otherwise I found the code clear to understand and well structured and well tested.
I ran ran code coverage to see if there are any additional obvious cases that are not covered, and it seem like we should add at least basic coverage for the other time units (s, ms, and us)
(from datafusion/datafusion/optimizer/src/analyzer/type_coercion.rs.html)

Here is what I ran
cargo llvm-cov --html --test sqllogictestshere is the report
Thank you @pepijnve and @martin-g for the comments. I would like to know your thoughts and will wait until you think this PR is ready to merge before doing so
| Interval(IntervalUnit::MonthDayNano), | ||
| )) | ||
| } | ||
| // These might seem to be a bit convoluted, however for arrow to do date + time arithmetic |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are there additional date/time kernels we should add to arrow-rs that would make this code easier? (as a follow on set of PRs of course)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
date + time -> timestamp
time +/- time -> time
Those are the two that make sense to me to push down into arrow
| # Subtract dates, producing the number of days elapsed | ||
| # date '2001-10-01' - date '2001-09-28' → 3 | ||
|
|
||
| # note that datafusion returns Duration whereas postgres returns an int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we file a follow on ticket for this , I recommend we also leave a link in the code back to the issue
|
Code and tests updated, coverage report now lgtm @alamb |

Which issue does this PR close?
Rationale for this change
Improve coverage of date / time / interval arithmetic operations
What changes are included in this PR?
type coercion improvements, numerous slt tests.
Are these changes tested?
Yes
Are there any user-facing changes?
Additional arithmetic support.
Thanks to @foskey51 for the timestamp + duration fix, @pepijnve for the initial set of .slt tests.