Add docstring examples for Scalar trigonometric functions #1411
ntjohnson1 wants to merge 1 commit into apache:main
Add example usage to docstrings for Scalar trigonometric functions to improve documentation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@pytest.fixture(autouse=True)
def _doctest_namespace(doctest_namespace: dict) -> None:
    """Add common imports to the doctest namespace."""
    doctest_namespace["dfn"] = dfn
This auto imports dfn and np to save 2 lines in each example.
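The fixture relies on pytest's `doctest_namespace`, a dict whose entries become names visible to the collected doctests. A rough stdlib-only analogue of that mechanism is sketched below, with `math` standing in for `dfn`. The alias `m`, the example string, and the helper `run_with_namespace` are hypothetical and not part of the PR.

```python
import doctest
import math

# Docstring-style example that never imports anything itself;
# it relies on the name `m` being injected by the runner.
NAMESPACE_EXAMPLE = """
>>> round(m.sin(m.pi / 2), 6)
1.0
"""


def run_with_namespace(source, namespace):
    """Run one doctest string with `namespace` pre-populated.

    Returns doctest's result tuple, which exposes `failed` and `attempted`.
    """
    parser = doctest.DocTestParser()
    # Arguments: source text, globals dict, test name, filename, line number.
    test = parser.get_doctest(source, namespace, "namespace_example", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)


# `m` plays the role that `dfn` plays in the fixture.
namespace_results = run_with_namespace(NAMESPACE_EXAMPLE, {"m": math})
print(namespace_results.failed, namespace_results.attempted)
```

The trade-off is the one discussed in this thread: injected names shorten each example, but a reader who copies the example must know where those names come from.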
>>> from math import pi
>>> ctx = dfn.SessionContext()
>>> df = ctx.from_pydict({"a": [pi / 4]})
>>> import builtins
If we don't like builtins, doctest also has an ellipsis notation for the expected return values.
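The ellipsis notation here is doctest's `ELLIPSIS` option, under which `...` in the expected output matches any substring. A minimal stdlib-only sketch follows, using `math.sin` as a stand-in for a DataFusion result. The example string and the helper `run_with_ellipsis` are illustrative and not taken from the PR.

```python
import doctest
import math

# With +ELLIPSIS, "..." in the expected output matches any run of
# characters, so the trailing float digits need not be spelled out.
ELLIPSIS_EXAMPLE = """
>>> math.sin(math.pi / 4)  # doctest: +ELLIPSIS
0.7071...
"""


def run_with_ellipsis(source):
    """Run one doctest string and return doctest's result tuple."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(source, {"math": math}, "ellipsis_example", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)


ellipsis_results = run_with_ellipsis(ELLIPSIS_EXAMPLE)
print(ellipsis_results.failed, ellipsis_results.attempted)
```

The option can also be enabled for a whole test run instead of per example, for instance through pytest's `doctest_optionflags` ini setting.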
kosiew left a comment
Thanks for working on this.
Could this live in a repo-level or python/-level conftest.py instead of inside the package?
The new pytest fixture is test-only infrastructure, but it now lives under python/datafusion/, which is the shipped package tree.
That makes the package layout less cohesive and couples the runtime module directory to pytest-specific setup.
A higher-level conftest.py would still support doctests while keeping test wiring out of the library package.
Yes. We should be able to place it at the root. Will test. I tried adding it to the existing conftest.py nested under the tests directory, but it didn't seem to get picked up.
>>> df = ctx.from_pydict({"y": [0.0], "x": [1.0]})
>>> result = df.select(
...     dfn.functions.atan2(dfn.col("y"), dfn.col("x")).alias("atan2"))
>>> result = result
Will remove the duplicate.
@@ -491,16 +491,28 @@ def abs(arg: Expr) -> Expr:
def acos(arg: Expr) -> Expr:
I think it would be helpful to extract shared example setup, or align the example style with existing docstrings.
Most of the new examples duplicate the same SessionContext / from_pydict / select / collect_column flow with only the function name and literal changing. That is fine for a few functions, but the repetition will get expensive as more scalar helpers gain examples.
It would be worth standardizing on a small doctest helper namespace or a more consistent public-facing import style so future additions do not copy-paste boilerplate across dozens of docstrings.
I think there is value in the examples being copy-pastable and fully standalone if it's only a few lines of boilerplate. For reference, see NumPy: https://numpy.org/doc/2.4/reference/generated/numpy.inner.html#numpy.inner
Expensive computationally or for maintenance? This is in line with the style of the existing doc examples as far as I can tell: https://github.com/rerun-io/datafusion-python/blob/231ed2b1d375fefe9aa01cdc8ae41c620c772f76/python/datafusion/dataframe.py#L324
I see your point, and it is defensible that each example is self-contained.
"Expensive computationally or for maintenance?"
Maintenance.
Which issue does this PR close?
Rationale for this change
Adding doctested examples to the docstrings makes the functions easier to use for both humans and agents.
What changes are included in this PR?
The original PR added docstring examples to essentially everything in functions. I split it up: the infrastructure went into a separate PR (already merged), and groups of functions will follow in a series of smaller PRs so they are easier to review. This is the first of those follow-ups, opened to get feedback on whether the scope is reasonable. Everything is co-authored with Claude, since I used Claude to extend the handwritten examples I wrote for reference and to split apart the large PR rather than doing it manually.
I reviewed all the code before opening this PR.
Are there any user-facing changes?
No