refac: Harmonize linopy operations with breaking convention#591

Draft
FBumann wants to merge 19 commits into harmonize-linopy-operations from harmonize-linopy-operations-mixed

@FBumann (Collaborator) commented Feb 20, 2026

Arithmetic Convention for linopy

Strict coordinate alignment for linopy: mismatches are loud, broadcasting preserves algebra.

Convention

Exact match on shared dimensions — When two operands share a dimension, their labels must match exactly (join="exact"). Raises ValueError on mismatch.

Free broadcasting on non-shared dimensions — Constants and expressions can introduce new dimensions. All standard algebraic laws hold (commutativity, associativity, distributivity).

x[time] + y[time]                      # ✓ same dims, matching coords
cost[tech] * gen[time, tech]            # ✓ constant broadcasts over time
x[time] + profile[tech, time]           # ✓ constant introduces tech
gen[time, tech] + risk[tech, scenario]  # ✓ expr+expr broadcasts

x[time] + y_short[time=0:2]            # ✗ mismatched coords on shared dim

Escape hatches: `.sel()` to subset, `.add(join="outer")` for explicit join, `linopy.align()` for pre-alignment.
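
The two rules can be modelled without linopy. The sketch below is a toy model, not linopy's implementation: each operand is reduced to its coordinates (a dict mapping dimension name to labels), and the hypothetical helper `combine_coords` returns the coordinates of the result or raises `ValueError`. It assumes labels are compared in order.

```python
def combine_coords(left, right):
    """Toy model of the convention (hypothetical helper, not linopy API).

    Each operand is a dict mapping dimension name -> list of labels.
    """
    result = dict(left)
    for dim, labels in right.items():
        if dim in left:
            # Rule 1: shared dimensions must carry exactly the same labels.
            if list(left[dim]) != list(labels):
                raise ValueError(f"labels differ on shared dimension {dim!r}")
        else:
            # Rule 2: non-shared dimensions broadcast freely.
            result[dim] = list(labels)
    return result


time = [0, 1, 2, 3]
tech = ["wind", "solar"]

# gen[time, tech] * cost[tech]: tech matches, time is kept.
print(combine_coords({"time": time, "tech": tech}, {"tech": tech}))

# x[time] + y_short[time=0:2]: mismatch on the shared dimension.
try:
    combine_coords({"time": time}, {"time": time[:2]})
except ValueError as err:
    print("rejected:", err)
```

Because the result only depends on the union of dimensions, the order of the operands does not change which dimensions and labels come out, which is what keeps broadcasting compatible with the algebraic laws.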

Changes vs. master

  • Exact join default: All arithmetic (`+`, `-`, `*`, `/`) and constraint operators (`<=`, `>=`, `==`) now default to `join="exact"` instead of xarray's inner join
  • `linopy.align()` default: Changed from `join="inner"` to `join="exact"`, and delegates to each type's `.reindex()` for correct sentinel fill values
  • New `.reindex()` methods: `Expression.reindex(fill_value=NaN)` and `Variable.reindex()` with type-aware sentinel handling
  • Named arithmetic methods: `.add(join=...)`, `.sub()`, `.mul()`, `.div()` for explicit join control
  • New tests: `test/test_algebraic_properties.py` — spec and tests for all algebraic laws
  • Documentation: `examples/arithmetic-convention.ipynb`

Open Questions / TODOs

  • Pipe operator — Discuss desired API.
  • Rollout strategy — Clean break vs. opt-in transition period (e.g. `Model(strict_alignment=True)`)?
  • Deprecation warnings — Warn before switching to exact join as default?

FBumann and others added 5 commits February 20, 2026 09:01
Use "exact" join for +/- (raises ValueError on mismatch), "inner" join
for * and / (intersection), and "exact" for constraint DataArray RHS.
Named methods (.add(), .sub(), .mul(), .div(), .le(), .ge(), .eq())
accept explicit join= parameter as escape hatch.

- Remove shape-dependent "override" heuristic from merge() and
  _align_constant()
- Add join parameter support to to_constraint() for DataArray RHS
- Forbid extra dimensions on constraint RHS
- Update tests with structured raise-then-recover pattern
- Update coordinate-alignment notebook with examples and migration guide

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@FBumann (Collaborator, Author) commented Feb 20, 2026

@FabianHofmann I'm quite happy with the notebook now. It showcases the convention and its consequences.
The tests still need some work, though, and so does the migration.
Looking forward to your opinion on the convention.

…ords. Here's what changed:

  - test_linear_expression_sum / test_linear_expression_sum_with_const: v.loc[:9].add(v.loc[10:], join="override") → v.loc[:9] + v.loc[10:].assign_coords(dim_2=v.loc[:9].coords["dim_2"])
  - test_add_join_override → test_add_positional_assign_coords: uses v + disjoint.assign_coords(...)
  - test_add_constant_join_override → test_add_constant_positional: now uses different coords [5,6,7] + assign_coords to make the test meaningful
  - test_same_shape_add_join_override → test_same_shape_add_assign_coords: uses + c.to_linexpr().assign_coords(...)
  - test_add_constant_override_positional → test_add_constant_positional_different_coords: expr + other.assign_coords(...)
  - test_sub_constant_override → test_sub_constant_positional: expr - other.assign_coords(...)
  - test_mul_constant_override_positional → test_mul_constant_positional: expr * other.assign_coords(...)
  - test_div_constant_override_positional → test_div_constant_positional: expr / other.assign_coords(...)
  - test_variable_mul_override → test_variable_mul_positional: a * other.assign_coords(...)
  - test_variable_div_override → test_variable_div_positional: a / other.assign_coords(...)
  - test_add_same_coords_all_joins: removed "override" from loop, added assign_coords variant
  - test_add_scalar_with_explicit_join → test_add_scalar: simplified to expr + 10
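
The renamed tests replace `join="override"` with an explicit relabelling step. As a rough illustration of that pattern, the sketch below uses plain xarray `DataArray` objects as stand-ins for linopy variables; the arrays and the dimension name are made up for the example.

```python
import xarray as xr

a = xr.DataArray([1, 2, 3], coords=[("dim_2", [0, 1, 2])])
b = xr.DataArray([10, 20, 30], coords=[("dim_2", [5, 6, 7])])

# Disjoint labels: an exact join refuses to align the two operands.
try:
    xr.align(a, b, join="exact")
except ValueError:
    print("labels do not match")

# Relabel b with a's coordinates, then add position by position.
b_relabelled = b.assign_coords(dim_2=a.coords["dim_2"].values)
total = a + b_relabelled
print(total.values.tolist())  # [11, 22, 33]
```

The relabelling makes the positional intent visible at the call site instead of hiding it in a join mode.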

@FBumann (Collaborator, Author) commented Feb 27, 2026

The convention should be "exact" for all of +, -, *, /, with an additional check that neither side may introduce dimensions the other doesn't have — also for all operations.

Why "exact" instead of "inner" for * and /

"exact" still broadcasts freely over dimensions that only exist on one side — it only enforces strict matching on shared dimensions. So the common scaling pattern works fine:

cost = xr.DataArray([10, 20], coords=[("tech", ["wind", "solar"])])
capacity  # dims: (tech=["wind", "solar"], region=["A", "B"])

cost * capacity  # ✓ tech matches exactly, region broadcasts freely

"inner" is dangerous: if coords on a shared dimension don't match due to a typo or upstream change, it silently drops values. The explicit and safe way to subset before multiplying is:

capacity.sel(tech=["wind", "solar"]) * renewable_cost
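
The silent-drop risk can be reproduced with plain xarray, whose default arithmetic join is "inner". This is a sketch with invented data (including the deliberately misspelled label), not an excerpt from linopy.

```python
import xarray as xr

capacity = xr.DataArray(
    [5.0, 7.0, 9.0], coords=[("tech", ["wind", "solar", "gas"])]
)
# Typo in one label: "solr" instead of "solar".
cost = xr.DataArray(
    [10.0, 20.0, 30.0], coords=[("tech", ["wind", "solr", "gas"])]
)

# Inner join: the mismatched label disappears without any warning.
product = cost * capacity
print(product.sizes["tech"])  # 2

# An exact join surfaces the typo instead.
try:
    xr.align(cost, capacity, join="exact")
except ValueError as err:
    print("exact join failed:", type(err).__name__)
```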

No operation should introduce new dimensions

Neither side of any arithmetic operation should be allowed to introduce dimensions the other doesn't have. The same problem applies to + and - as to * and / — new dimensions silently expand the optimization problem in unintended ways:

cost_expr      # dims: (tech, time)
regional_expr  # dims: (tech, time, region)

cost_expr + regional_expr  # ✗ silently expands to (tech, time, region)

capacity  # dims: (tech, region, time)
risk      # dims: (tech, scenario)
risk * capacity  # ✗ silently expands to (tech, region, time, scenario)

An explicit pre-check on all operations:

asymmetric_dims = set(other.dims).symmetric_difference(set(self.dims))
if asymmetric_dims:
    raise ValueError(f"Operation introduces new dimensions: {asymmetric_dims}")

Summary

| Operation | Convention |
| --- | --- |
| +, -, *, / | "exact" on shared dims; neither side may introduce dims the other doesn't have |

@coroa (Member) commented Feb 27, 2026

The convention should be "exact" for all of +, -, *, /, with an additional check that neither side may introduce dimensions the other doesn't have — also for all operations.

Let's clearly differentiate between dimensions and labels.

labels

I agree with "exact" for labels by default, but we need an easy way to have inner or outer joining characteristics. I found the pyoframe conventions strange at the beginning, but they grew on me:

x + y.keep_extras() to say that an outer join is in order and mismatches should fill with 0.

x + y.drop_extras() to say that you want an inner join.
x.drop_extras() + y does the same, though.

In a different project I have used | 0 to indicate keep_extras, i.e. (x + y | 0).

dimensions

I am actually fond of the ability to auto-broadcast over different dimensions and would want to keep that (actually my main problem with pyoframe).

Your first example actually implicitly assumes broadcasting.

@FBumann (Collaborator, Author) commented Feb 28, 2026

Dimensions and broadcasting

I agree that auto broadcasting is helpful in some cases.
I'm happy with allowing broadcasting of constants. We could allow this always...?
But I would enforce that the constant never has more dims than the variable/expression.
Or is there a use case for this?

So the full convention requires two separate things:
1. "exact" join — shared dims must have matching coords (xarray handles this)
2. Subset dim check — the constant side’s dims must be a subset of the variable/expression (custom pre-check needed)
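
The subset check in point 2 could look roughly like the following. `check_constant_dims` is a hypothetical helper written for this comment, operating on dimension names only; it is not existing linopy code.

```python
def check_constant_dims(expr_dims, const_dims):
    """Raise if the constant carries dimensions the expression lacks.

    Hypothetical helper illustrating the proposed pre-check.
    """
    extra = set(const_dims) - set(expr_dims)
    if extra:
        raise ValueError(f"Constant introduces new dimensions: {sorted(extra)}")


# cost[tech] * capacity[tech, region]: constant dims are a subset, so this passes.
check_constant_dims(expr_dims=("tech", "region"), const_dims=("tech",))

# risk[tech, scenario] * capacity[tech, region, time]: "scenario" is new.
try:
    check_constant_dims(("tech", "region", "time"), ("tech", "scenario"))
except ValueError as err:
    print(err)  # Constant introduces new dimensions: ['scenario']
```

Unlike the symmetric-difference check from the earlier comment, this one is one-sided: the expression may have dimensions the constant lacks, but not the other way round.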

labels

I'm not sure I like this approach, as it needs careful state management of the flags on expressions. The flag (keep or drop extras) needs to be handled.
I would rather require users to reindex or fill the data to the correct index.
I think aligning is the correct approach:

import linopy

# outer join — fill gaps with 0 before adding
x_aligned, y_aligned = linopy.align(x, y, join="outer", fill_value=0)
x_aligned + y_aligned

# inner join — drop non-matching coords before adding
x_aligned, y_aligned = linopy.align(x, y, join="inner")
x_aligned + y_aligned
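
For comparison, the same two joins on plain xarray arrays, using `xr.align` (which accepts `join` and `fill_value`). The data is invented, and whether `linopy.align` ends up mirroring this signature is exactly what is under discussion here.

```python
import xarray as xr

x = xr.DataArray([1.0, 2.0, 3.0], coords=[("time", [0, 1, 2])])
y = xr.DataArray([10.0, 20.0, 30.0], coords=[("time", [1, 2, 3])])

# Outer join: union of labels, gaps filled with 0 before adding.
x_outer, y_outer = xr.align(x, y, join="outer", fill_value=0)
print((x_outer + y_outer).values.tolist())  # [1.0, 12.0, 23.0, 30.0]

# Inner join: only labels present on both sides survive.
x_inner, y_inner = xr.align(x, y, join="inner")
print((x_inner + y_inner).values.tolist())  # [12.0, 23.0]
```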

Combining disjoint expressions would then still need the explicit methods, though.
I'm interested in your take on this.

@FBumann (Collaborator, Author) commented Feb 28, 2026

The proposed convention for all arithmetic operations in linopy:
1. "exact" join by default: shared coords must match exactly; raises on mismatch
2. Subset dim check: constants may not introduce dimensions the variable/expression doesn't have
3. No implicit inner join: use .sel() explicitly instead
4. Outer join with fill: use x + (y | 0) or .add(join="outer", fill_value=0)
The escape hatches in order of preference: .sel() for subsetting, | 0 for inline fill, named method .add(join=...) for everything else. No context manager needed.

I'm not sure how to implement the | operator yet. It might need some sort of flag/state for deferred indexing.

@FBumann (Collaborator, Author) commented Feb 28, 2026

I thought about the pipe operator:
I think it should only work with linopy's internal types (Variables/Expressions), not constants (scalar, numpy, pandas, DataArray), as supporting those would require a lot of monkey patching and would be hard to make stable.

Would this be an issue for you?

FBumann and others added 12 commits March 9, 2026 17:43
… constants

Implements the new arithmetic convention for all operations (+, -, *, /):
- Rule 1: Exact label matching on shared dimensions (join="exact")
- Rule 2: Constants cannot introduce new dimensions not in the expression

Adds escape hatches: FillWrapper via `expr | 0`, named methods with
explicit join= parameter, and linopy.align() with configurable join.

Changes FILL_VALUE["const"] from NaN to 0 for cleaner semantics.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instead of unwrapping to Datasets, aligning, and manually reconstructing
linopy types, align now calls each object's own .reindex() which handles
type-specific fill values (vars=-1, coeffs=NaN, const=0) automatically.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
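
To make the type-specific fill values concrete, here is a toy model in plain Python. It only mirrors the sentinels named in the commit message above (vars=-1, coeffs=NaN, const=0); the helper `reindex_field` and the list-based layout are assumptions for illustration, not how linopy stores its data.

```python
import math

# Sentinel fill values per field, as listed in the commit message above.
SENTINELS = {"vars": -1, "coeffs": math.nan, "const": 0}


def reindex_field(values, old_labels, new_labels, fill):
    """Return values laid out on new_labels, using fill for missing labels."""
    lookup = dict(zip(old_labels, values))
    return [lookup.get(label, fill) for label in new_labels]


old_labels = [0, 1, 2]
new_labels = [0, 1, 2, 3]
expr = {"vars": [7, 8, 9], "coeffs": [1.0, 2.0, 0.5], "const": [0.0, 0.0, 4.0]}

reindexed = {
    name: reindex_field(values, old_labels, new_labels, SENTINELS[name])
    for name, values in expr.items()
}
print(reindexed["vars"])   # [7, 8, 9, -1]
print(reindexed["const"])  # [0.0, 0.0, 4.0, 0]
```

A single scalar fill value would not work here: filling variable labels with NaN or 0 would either change the dtype or point at a real variable.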
- Remove FillWrapper class and | operator (deferred for later design)
- Expression.reindex: scalar fill_value applies to const only,
  vars/coeffs always use sentinels (-1/NaN)
- Variable.reindex: no fill_value param, always uses sentinels
- Update notebook: remove | 0 section, fix align example
- Clean up __init__.py exports

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Default fill_value=0, always applies to const only. No dict pass-through.
vars/coeffs always use sentinels.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tests verify commutativity, associativity, distributivity, identity,
and negation laws. Two known breakages (associativity and distributivity
with constants that introduce new dims) are marked xfail.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Spec and tests for commutativity, associativity, distributivity,
identity, negation, and zero. Two known violations marked xfail:
associativity and distributivity with constants that introduce new dims.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Constants can now introduce new dimensions in arithmetic (+, -, *, /),
preserving all standard algebraic laws (associativity, distributivity).
The dim-subset check remains for constraint RHS to catch accidental
broadcasting. Default fill value for const changed from 0 to NaN.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Constraint RHS can now introduce new dimensions, just like arithmetic.
For ==, broadcasting to incompatible values results in solver infeasibility.
For <=/>= it creates redundant but harmless constraints.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
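
The effect of a constraint RHS that introduces a new dimension can be worked through by hand. The numbers below are invented, and the variable x is treated as a single scalar for simplicity.

```python
# Toy model: a scalar variable x constrained against rhs[scenario].
rhs = {"low": 3.0, "base": 5.0, "high": 8.0}

# x <= rhs[scenario] yields one constraint per scenario. All but the
# tightest are redundant: together they reduce to x <= min(rhs).
upper_bound = min(rhs.values())
print(upper_bound)  # 3.0

# x == rhs[scenario] demands that x equal every value at once, which is
# only possible when all broadcast values coincide.
feasible = len(set(rhs.values())) == 1
print(feasible)  # False
```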
Pure xarray/pandas/numpy operations before entering linopy use their
own alignment rules. Document the risks and the xarray exact join
workaround.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Document assign_coords (recommended) and join="override" for handling
operands with mismatched coordinate labels.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@FBumann (Collaborator, Author) commented Mar 9, 2026

@coroa @FabianHofmann I'm quite happy with this. Looking forward to your thoughts.
I updated the PR description to reflect the current state.

3 participants