fix: support total_size=0 in byte_range_to_row_range #7115
AdamGS merged 1 commit into vortex-data:develop
If the file has 0 rows, shouldn't |
22bd055 to 8f0f23a
Fair point. I updated the PR and added a Nit: The changes are in a fixup commit, so they need to be squashed before the PR is merged. |
    let empty_stream = stream::iter(vec![]).boxed();
    return Ok(empty_stream);
just a nit - I think using the built-in utility is nicer
Suggested change:
-   let empty_stream = stream::iter(vec![]).boxed();
-   return Ok(empty_stream);
+   return Ok(futures::stream::empty().boxed());
overall LGTM, just running CI to see everything is good
btw - you need to sign-off on your commits for us to merge it, we'll also squash it before merging.
8f0f23a to 4ce2594
Done. I used the occasion to also squash the commits and update the commit message to align with the PR description.
In certain cases DataFusion might decide to create ranges for empty
files that have zero rows. In order to avoid hitting a division by zero
error in the
```rust
let average_row = total_size / row_count;
```
line, add an early exit check to `VortexOpener::open` to immediately
return an empty stream if the file contains zero rows:
```rust
if vxf.row_count() == 0 {
    let empty_stream = stream::iter(vec![]).boxed();
    return Ok(empty_stream);
}
```
The enforced `row_count > 0` invariant for the `byte_range_to_row_range`
call is now also called out with a `debug_assert!` macro.
Add a new `test_open_empty_file` test case.
Signed-off-by: Alexander Alexandrov <alexander.s.alexandrov@gmail.com>
4ce2594 to 18d8814
thank you for contributing this fix!
Summary
In certain cases DataFusion might decide to create ranges for empty files that have zero rows. In order to avoid hitting a division by zero error in the `let average_row = total_size / row_count;` line, add an early exit check to `VortexOpener::open` to immediately return an empty stream if the file contains zero rows.
The enforced `row_count > 0` invariant for the `byte_range_to_row_range` call is now also called out with a `debug_assert!` macro.
Testing
Add a new `test_open_empty_file` test case.
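For readers skimming the thread, here is a minimal, std-only sketch of the guard pattern described above. It is not the code from this PR: `byte_range_to_row_range_sketch` and `rows_to_scan` are hypothetical names, the signature is an assumption, and the `.max(1)` clamp is an extra safeguard that the PR does not mention. The caller handles `row_count == 0` before any division takes place, and the helper documents the invariant with `debug_assert!`.

```rust
use std::ops::Range;

/// Hypothetical stand-in for the helper discussed in this PR; the real
/// signature and body in the codebase may differ.
fn byte_range_to_row_range_sketch(
    byte_range: Range<u64>,
    row_count: u64,
    total_size: u64,
) -> Range<u64> {
    // Invariant: callers must have handled empty files already.
    debug_assert!(row_count > 0, "row_count must be positive");
    // Average bytes per row (integer division). The clamp to 1 is an
    // assumption of this sketch so the divisor below is never zero.
    let average_row = (total_size / row_count).max(1);
    let start_row = byte_range.start / average_row;
    let end_row = byte_range.end / average_row;
    // Clamp both ends, since the estimate can overshoot the real row count.
    start_row.min(row_count)..end_row.min(row_count)
}

/// Hypothetical caller-side guard mirroring the early exit:
/// `None` means there is nothing to scan.
fn rows_to_scan(
    byte_range: Range<u64>,
    row_count: u64,
    total_size: u64,
) -> Option<Range<u64>> {
    if row_count == 0 {
        return None;
    }
    Some(byte_range_to_row_range_sketch(byte_range, row_count, total_size))
}
```

With these definitions, `rows_to_scan(0..0, 0, 0)` returns `None` without dividing, while `rows_to_scan(0..500, 100, 1000)` maps the first 500 bytes of a 1000-byte, 100-row file to rows `0..50`.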