Single-node Python unit tests fail #52

@edmondop

After enabling the Python unit tests on my fork here, this simple test fails:

from datafusion import SessionContext
from datafusion_ray import DatafusionRayContext


def test_basic_query_succeed():
    df_ctx = SessionContext()
    ctx = DatafusionRayContext(df_ctx)
    df_ctx.register_csv("tips", "examples/tips.csv", has_header=True)
    record_batch = ctx.sql("SELECT * FROM tips")
    assert record_batch.num_rows == 244

As this CI run shows
https://github.com/edmondop/datafusion-ray/actions/runs/12268595231/job/34230694956:

the worker fails with (execute_query_stage pid=8895) index out of bounds: the len is 0 but the index is 0

The failure comes from the query stage code:

    pub fn get_input_partition_count(&self) -> usize {
        self.plan.children()[0]
            .properties()
            .output_partitioning()
            .partition_count()
    }

This panics because the CsvExec is a leaf node with no children, so children()[0] indexes into an empty list.

Query stage #0:
CsvExec: file_groups={1 group: [[home/runner/work/datafusion-ray/datafusion-ray/examples/tips.csv]]}, projection=[total_bill, tip, sex, smoker, day, time, size], has_header=true
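
The failure mode can be illustrated in isolation with a small Python sketch. PlanNode and both helper functions are hypothetical stand-ins written for this illustration, not part of DataFusion or datafusion-ray, and falling back to the node's own partition count for a leaf is only one possible fix, assumed here for the sake of the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    """Hypothetical stand-in for a physical plan node (not the DataFusion API)."""
    name: str
    output_partitions: int
    children: List["PlanNode"] = field(default_factory=list)


def input_partition_count_unchecked(plan: PlanNode) -> int:
    # Mirrors the reported code: always index the first child.
    return plan.children[0].output_partitions


def input_partition_count_guarded(plan: PlanNode) -> int:
    # Assumption: a leaf node has no input, so use its own partition count.
    if not plan.children:
        return plan.output_partitions
    return plan.children[0].output_partitions


# A single-stage plan like the one above: the stage root is a leaf scan.
csv_exec = PlanNode("CsvExec", output_partitions=1)

try:
    input_partition_count_unchecked(csv_exec)
except IndexError as err:
    print("unchecked lookup failed:", err)

print("guarded lookup:", input_partition_count_guarded(csv_exec))
```

If this reading is right, any query whose stage root is a leaf scan would hit the same path, whether or not Ray runs locally.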

It might be related to the fact that the unit tests run against a local Ray instance.
