-
Notifications
You must be signed in to change notification settings - Fork 118
Fix list-of-struct nested projection #6012
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Fix list-of-struct nested projection #6012
Conversation
|
I wasn't able to apply labels from a fork (permission denied). Could a maintainer please add the ix label for the changelog/CI gating? |
| // List-of-struct field access: `$.items.a` where `items: list<struct{a,b}>`. | ||
| match input.dtype() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't really want to tie the get_item to have this for_each element behaviour.
I am not sure we want get_item to pierce thought lists, we would want a this to be map_list(get_item("a", _), get_item("items", root()) I am not sure we have designed the map_list or the _ expression yet?
|
This is a great contribution, however we need to design the list expression more carefully |
Codecov Report❌ Patch coverage is ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Started a discussion: #6030, we can reopen once we find a good answer there. Maybe you can expand on your use case there. |
8a8803c to
11f06b6
Compare
| row_range: &Range<u64>, | ||
| splits: &mut BTreeSet<u64>, | ||
| ) -> VortexResult<()> { | ||
| splits.insert(row_range.end); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The one thing I struggled with the last time I tried to do ListLayout is that it's hard to split naturally.
Ideally you'd have a way to split based on the elements, then turn that back into row ranges...
d5a8576 to
87d9553
Compare
Summary
vortex.listlayout forList/FixedSizeListcolumnslist<struct>(e.g. projectingitems.a)vortex.get_itemand the list reader to supportlist<struct>Why
Tests
cargo +nightly fmt --all --checkcargo clippy --locked --workspace --all-features --all-targets --exclude vortex-bench --exclude vortex-python --exclude vortex-duckdb --exclude vortex-fuzz --exclude duckdb-bench --exclude lance-bench --exclude datafusion-bench --exclude random-access-bench --exclude compress-bench -- -D warningscargo nextest run --locked --workspace --all-features --no-fail-fast --exclude vortex-bench --exclude vortex-python --exclude vortex-duckdb --exclude vortex-fuzz --exclude duckdb-bench --exclude lance-bench --exclude datafusion-bench --exclude random-access-bench --exclude compress-bench(fails locally:vortex-cuda for_::tests::test_cuda_for_decompressionrequires a CUDA driver)cargo nextest run --locked --workspace --all-features --no-fail-fast --exclude vortex-cuda --exclude vortex-bench --exclude vortex-python --exclude vortex-duckdb --exclude vortex-fuzz --exclude duckdb-bench --exclude lance-bench --exclude datafusion-bench --exclude random-access-bench --exclude compress-benchCompatibility
vortex.list). Older readers that don't know this encoding won't be able to read those files; new readers continue to read existing files.References