After enabling the Python unit tests on my fork here, this simple test fails:
    def test_basic_query_succeed():
        df_ctx = SessionContext()
        ctx = DatafusionRayContext(df_ctx)
        df_ctx.register_csv("tips", "examples/tips.csv", has_header=True)
        record_batch = ctx.sql("SELECT * FROM tips")
        assert record_batch.num_rows == 244
As one can see from this run
https://github.com/edmondop/datafusion-ray/actions/runs/12268595231/job/34230694956:
the cause is this panic: (execute_query_stage pid=8895) index out of bounds: the len is 0 but the index is 0
In effect, the problem comes from the Query Stage code:
    pub fn get_input_partition_count(&self) -> usize {
        self.plan.children()[0]
            .properties()
            .output_partitioning()
            .partition_count()
    }
which panics because the CsvExec is a leaf node with no children, so indexing children()[0] is out of bounds. The plan for the failing stage is:
Query stage #0:
CsvExec: file_groups={1 group: [[home/runner/work/datafusion-ray/datafusion-ray/examples/tips.csv]]}, projection=[total_bill, tip, sex, smoker, day, time, size], has_header=true
It might be related to the fact that the unit tests run against a local Ray instance.
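To illustrate the failure mode outside the project, here is a minimal, self-contained sketch. `PlanNode`, its fields, and both functions are hypothetical stand-ins, not the DataFusion or datafusion-ray API. It only shows how indexing `[0]` on an empty child list panics, and how `first()` lets the caller handle a leaf such as CsvExec explicitly. What the real code should return for a leaf stage is a design decision this sketch does not try to answer.

```rust
// Hypothetical stand-in for a physical plan node (assumption: not the real
// DataFusion ExecutionPlan trait, just enough structure to show the bug).
struct PlanNode {
    output_partitions: usize,
    children: Vec<PlanNode>,
}

impl PlanNode {
    // A leaf node, like a CSV scan, has no children.
    fn leaf(output_partitions: usize) -> Self {
        PlanNode { output_partitions, children: Vec::new() }
    }

    // A node with exactly one input.
    fn with_child(output_partitions: usize, child: PlanNode) -> Self {
        PlanNode { output_partitions, children: vec![child] }
    }
}

// Same shape as the code quoted above: panics with
// "index out of bounds: the len is 0 but the index is 0" on a leaf.
fn input_partition_count_unchecked(plan: &PlanNode) -> usize {
    plan.children[0].output_partitions
}

// Guarded variant: returns None for a leaf instead of panicking,
// so the caller must decide what a leaf stage means.
fn input_partition_count(plan: &PlanNode) -> Option<usize> {
    plan.children.first().map(|child| child.output_partitions)
}
```

With this sketch, `input_partition_count(&PlanNode::leaf(1))` yields `None`, while the unchecked variant panics on the same input.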