Bug
open_geotiff returns different attrs keys depending on which backend handles the read. The eager numpy path (xrspatial/geotiff/__init__.py:411-510) populates several pass-through attrs from the file's TIFF tags. The dask path (__init__.py:1333-1342) and the GPU path (__init__.py:1881-1888) only set a small subset.
Keys dropped on dask read:
x_resolution, y_resolution, resolution_unit (TIFF resolution tags)
extra_tags (raw pass-through tag list)
image_description (tag 270)
extra_samples (tag 338, alpha indication)
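Keys dropped on GPU read:
nodata (already tracked by #1542, "read_geotiff_gpu skips nodata masking and drops attrs['nodata']", and PR #1547, "Apply nodata mask in read_geotiff_gpu (#1542)")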
Reproducer
import numpy as np
import xarray as xr
from xrspatial.geotiff import to_geotiff, open_geotiff

# Write a small tiled GeoTIFF that carries resolution attrs and a nodata value.
arr = np.random.random((128, 128)).astype(np.float32)
da = xr.DataArray(arr, dims=['y', 'x'], attrs={
    'x_resolution': 300.0,
    'y_resolution': 300.0,
    'resolution_unit': 'inch',
})
path = '/tmp/attrs_test.tif'
to_geotiff(da, path, compression='deflate', tiled=True, tile_size=64, nodata=-1.0)

# Read the same file back through each backend.
np_da = open_geotiff(path)             # eager numpy
dk_da = open_geotiff(path, chunks=64)  # dask
gpu_da = open_geotiff(path, gpu=True)  # GPU (cupy)

# Compare the attrs keys and two representative values.
print('numpy attrs :', sorted(np_da.attrs.keys()))
print('dask_cpu attrs:', sorted(dk_da.attrs.keys()))
print('cupy attrs :', sorted(gpu_da.attrs.keys()))
print('x_resolution: np=', np_da.attrs.get('x_resolution'),
      'dk=', dk_da.attrs.get('x_resolution'),
      'gpu=', gpu_da.attrs.get('x_resolution'))
print('nodata: np=', np_da.attrs.get('nodata'),
      'dk=', dk_da.attrs.get('nodata'),
      'gpu=', gpu_da.attrs.get('nodata'))
Observed:
numpy attrs : ['nodata', 'resolution_unit', 'transform', 'x_resolution', 'y_resolution']
dask_cpu attrs: ['nodata', 'transform']
cupy attrs : ['transform']
x_resolution: np= 300.0 dk= None gpu= None
nodata: np= -1.0 dk= -1.0 gpu= None
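The gap is easier to read as a set difference against the eager numpy result. The snippet below is a small sketch that hard-codes the key lists printed above instead of calling open_geotiff, so it runs without xrspatial or a GPU. The function name missing_keys is made up for illustration and is not part of xrspatial.

```python
# Key lists copied from the "Observed" output above (not read from a file).
observed = {
    "numpy": ["nodata", "resolution_unit", "transform", "x_resolution", "y_resolution"],
    "dask_cpu": ["nodata", "transform"],
    "cupy": ["transform"],
}


def missing_keys(observed, reference="numpy"):
    """Return, per backend, the reference keys that the backend did not set."""
    ref = set(observed[reference])
    return {
        name: sorted(ref - set(keys))
        for name, keys in observed.items()
        if name != reference
    }


dropped = missing_keys(observed)
for name, keys in dropped.items():
    print(f"{name} is missing: {keys}")
```

For this particular file the dask read is missing the three resolution keys and the GPU read is missing those plus nodata.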
Why this matters
to_geotiff reads attrs['extra_tags'] and the friendly resolution attrs (x_resolution, y_resolution, resolution_unit) when reconstructing the output file. If a read drops those attrs, a write, read, write round trip loses metadata. Downstream code that branches on attrs['x_resolution'] or attrs['nodata'] silently behaves differently depending on which backend is active.
#1542 / PR #1547 fixes the GPU nodata drop. This issue covers the broader attrs class that both the dask and GPU read paths still drop.
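To make the downstream effect concrete, here is a hedged illustration of consumer code that branches on attrs['nodata']. The function count_valid and the pixel values are invented for this example. The values are deliberately kept identical across the two calls to isolate the effect of the attrs alone, and the attrs dicts only mimic the shape of the observed numpy and GPU results.

```python
def count_valid(values, attrs):
    """Count cells that are not nodata, as a downstream consumer might."""
    nodata = attrs.get("nodata")
    if nodata is None:
        # No sentinel advertised: every cell is treated as valid.
        return len(values)
    return sum(1 for v in values if v != nodata)


# Invented pixel values; -1.0 plays the role of the nodata sentinel.
values = [0.25, -1.0, 0.75, -1.0]

# Attrs shaped like the observed numpy and GPU results (transform omitted).
numpy_attrs = {"nodata": -1.0, "x_resolution": 300.0}
gpu_attrs = {}

valid_numpy = count_valid(values, numpy_attrs)
valid_gpu = count_valid(values, gpu_attrs)
print("valid cells: numpy attrs =", valid_numpy, " gpu attrs =", valid_gpu)
```

Nothing raises in either call, which is why this kind of divergence tends to go unnoticed.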
Expected
All four backends (numpy, cupy, dask+numpy, dask+cupy) should populate the same attrs keys for the same input file. The eager numpy attrs set is the canonical reference.
Fix sketch
Factor the attrs population out of the eager numpy branch into a helper _populate_attrs_from_geo_info(geo_info, attrs) and call it from all three read paths so they cannot diverge again. Add a 4-backend equivalence test.
Audit pass
Found in geotiff backend parity sweep on 2026-05-09. Reproduced cleanly on this host on commit c41dfa6.