[SPARK-56295][PYTHON] Add Java error classes to Python side #55100

Open
gaogaotiantian wants to merge 3 commits into apache:master from gaogaotiantian:load-java-error-classes


@gaogaotiantian
Contributor

What changes were proposed in this pull request?

Load the Java-side error classes in Python so that Python can generate the same error messages as Java when given the same error class and message parameters.
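To illustrate what "same error class + message parameters" means in practice, here is a minimal sketch of rendering a message from an entry shaped like those in error-conditions.json (a `message` list with `<param>` placeholders). The `EXAMPLE_COLUMN_NOT_FOUND` entry and the `render_message` helper are hypothetical and are not taken from this PR or from PySpark's actual implementation.

```python
import re

# Hypothetical entry, shaped like an error-conditions.json record.
ERROR_CLASSES = {
    "EXAMPLE_COLUMN_NOT_FOUND": {
        "message": ["Column <columnName> does not exist in <tableName>."]
    }
}


def render_message(error_classes: dict, error_class: str, params: dict) -> str:
    """Fill the <param> placeholders of an error class template."""
    # Multi-line templates are stored as a list of lines.
    template = "\n".join(error_classes[error_class]["message"])
    # Replace each <name> with the matching message parameter.
    body = re.sub(r"<(\w+)>", lambda m: str(params[m.group(1)]), template)
    return f"[{error_class}] {body}"


print(
    render_message(
        ERROR_CLASSES,
        "EXAMPLE_COLUMN_NOT_FOUND",
        {"columnName": "age", "tableName": "people"},
    )
)
```

If both languages render from the same template, the same error class and parameters should produce the same text on either side, which is the consistency this PR is after.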

Why are the changes needed?

Currently there are two separate error-conditions.json files, one for Python and one for Java. This is probably for historical reasons: Java's JSON could not easily be loaded from Python, so a Python-specific file was created. The problem is that the two files are not kept in sync. When the same error needs to be raised from both Python and Java, we must either duplicate the error class in the Python JSON file or, if we forget to, break message generation on the Python side.

Ideally we would have a single source of truth, which is now achievable by merging all existing Python error classes into the Java file. But that could have unexpected consequences, and it would be difficult to split the files again once they are merged. Moreover, the Python file contains some Python-specific error classes that may not fit in the Java file.

As a compromise, we keep the Python file and also load the Java file from Python. This keeps the current structure while making it more flexible. If we later have a clear plan for which direction to take, we can still eliminate the Python JSON file.
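The compromise can be sketched roughly as follows: parse both JSON documents and combine them into one lookup table. The `merge_error_classes` helper and the sample entries are hypothetical, and letting Python entries win on a name clash is an assumption, not something this PR description states.

```python
import json
from typing import Optional


def merge_error_classes(python_json: str, java_json: Optional[str]) -> dict:
    """Combine Python and Java error classes into a single mapping."""
    merged: dict = {}
    if java_json is not None:
        merged.update(json.loads(java_json))
    # Assumption: Python-specific definitions take precedence on a clash.
    merged.update(json.loads(python_json))
    return merged


# Hypothetical sample data for illustration only.
PYTHON_JSON = '{"PY_ONLY": {"message": ["py"]}, "SHARED": {"message": ["from python"]}}'
JAVA_JSON = '{"JAVA_ONLY": {"message": ["java"]}, "SHARED": {"message": ["from java"]}}'

merged = merge_error_classes(PYTHON_JSON, JAVA_JSON)
print(sorted(merged))
```

With this shape, dropping the Python file later would only mean passing an empty JSON object for `python_json`.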

Does this PR introduce any user-facing change?

No.

How was this patch tested?

A new test is added to verify that Python can raise a Java-specific error class.

Was this patch authored or co-authored using generative AI tooling?

No.

ERROR_CLASSES_MAP = json.loads(ERROR_CLASSES_JSON)


def get_error_classes() -> dict[str, dict]:
Contributor


Will this impact Python Spark Connect client-only installations? They would not have the JARs installed.

Contributor Author


Good question. For now, it won't raise any error. The Connect client would simply be missing the Java-side error classes. As long as the Connect client only uses Python error classes, it should be fine.
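A minimal sketch of that fallback behaviour, assuming the Java definitions live in a file that may simply be absent on a client-only install. The `load_java_error_classes` helper and the path are hypothetical; how the PR actually locates the Java file is not shown in this thread.

```python
import json
from pathlib import Path


def load_java_error_classes(path: Path) -> dict:
    """Return the Java error classes, or an empty dict if the file is absent."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        # Client-only install: no Java resources, so contribute nothing.
        return {}


# Hypothetical location that does not exist on this machine.
java_classes = load_java_error_classes(Path("no-such-dir/error-conditions.json"))
print(len(java_classes))
```

Under this assumption, loading never fails on a client-only install, but looking up a Java-only error class there would still come back empty.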


2 participants