Decode predictor=2 on big-endian TIFFs by swapping to native order #1507
Merged
brendancol merged 1 commit into main on May 7, 2026
PR #1498 reworked predictor=2 to run sample-wise via a numpy view at the file's byte order. Numba's nopython mode rejects arrays with a non-native byte order, so reading a big-endian TIFF with uint16/uint32/uint64 + predictor=2 raised "Unsupported array dtype: >u2" instead of returning the pixel data. predictor_decode and predictor_encode now byteswap the buffer in place around the kernel call for files whose byte order differs from native. Bytes on the way out stay in the file's order so the downstream chunk.view(file_dtype) step in _decode_strip_or_tile keeps working. uint8 is unaffected (single-byte path skips the view). Tests cover uint16/int16/uint32/int32 round-trips through tifffile-built big-endian predictor=2 files plus a little-endian sanity check.
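The swap-around-the-kernel pattern described above can be sketched in plain NumPy. This is only an illustration under stated assumptions: `predictor2_decode_sketch` and its parameters are hypothetical names, not the project's `predictor_decode` signature, and `np.cumsum` stands in for the numba kernel. The point it shows is that the kernel only ever sees native-order samples while the buffer ends up back in the file's byte order.

```python
import sys
import numpy as np


def predictor2_decode_sketch(buf, rows, width, samples, bytes_per_sample, file_byteorder):
    """Undo horizontal differencing in place (hypothetical helper, not the PR's code).

    buf is a writable 1-D uint8 array holding differenced samples in the
    file's byte order; file_byteorder is "big" or "little".
    """
    native = np.dtype(f"u{bytes_per_sample}")  # native-order unsigned dtype
    needs_swap = bytes_per_sample > 1 and file_byteorder != sys.byteorder

    # Native-dtype view over the same memory, one row of pixels per scanline.
    view = buf.view(native).reshape(rows, width, samples)

    if needs_swap:
        view.byteswap(inplace=True)  # bytes are now in native order

    # Stand-in for the numba kernel: running sum per sample along each row,
    # wrapping modulo 2**bits as unsigned integer arithmetic does.
    view[...] = np.cumsum(view, axis=1, dtype=native)

    if needs_swap:
        view.byteswap(inplace=True)  # restore the file's byte order
    return buf


# Build a differenced big-endian uint16 row and decode it.
original = np.array([[100, 300, 65535, 2]], dtype=np.uint16)
diffed = original.copy()
diffed[:, 1:] = original[:, 1:] - original[:, :-1]  # wraps modulo 2**16
buf = np.frombuffer(bytearray(diffed.astype(">u2").tobytes()), dtype=np.uint8)

predictor2_decode_sketch(buf, rows=1, width=4, samples=1,
                         bytes_per_sample=2, file_byteorder="big")
decoded = buf.view(">u2").reshape(1, 4)  # downstream view in the file dtype
```

Because the second swap restores the file's byte order, a later view in the file dtype (the last line of the sketch) reads the decoded values correctly on either kind of host.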
brendancol added a commit to brendancol/xarray-spatial that referenced this pull request on May 8, 2026
brendancol added a commit that referenced this pull request on May 8, 2026
* Fix read_geotiff_gpu byteswap on big-endian multi-byte TIFFs

cupy.ndarray (13.x) does not expose .byteswap(), so any BE multi-byte TIFF hit AttributeError inside the GPU decode pipeline. The dispatcher in read_geotiff_gpu caught it and silently fell back to CPU, so output stayed correct but the GPU path was effectively dead for BE data.

Replace both arr.byteswap() calls with a small helper that views the array as the swapped-order dtype and copies, which works on numpy and cupy arrays alike.

Closes #1508

* Address PR #1515 review: preserve native dtype in _xp_byteswap

The earlier implementation, ``arr.view(arr.dtype.newbyteorder()).copy()``, left the result tagged with non-native byteorder (``>u2`` instead of ``<u2``). That's values-equivalent for arithmetic but breaks downstream consumers that expect native dtypes -- numba ``@ngjit`` rejects non-native arrays, which is the same class of bug PR #1507 fixed for predictor=2 BE.

The new implementation reverses bytes through a uint8 view:

    u8 = arr.view('u1').reshape(*arr.shape, arr.itemsize)
    return u8[..., ::-1].copy().view(arr.dtype).reshape(arr.shape)

Result preserves ``arr.dtype`` and is native-endian, matching numpy's ``ndarray.byteswap()`` contract. 1-byte dtypes short-circuit to a no-op return.

Tests now assert ``gpu_da.data.dtype.isnative`` and equality against the input native dtype, plus two pure-numpy tests of the helper itself. The module-level ``pytest.skip`` for missing CUDA was widened to cover both helper tests too; pulled it apart so the helper tests run without a GPU and only the GPU end-to-end tests gate on cupy+CUDA+tifffile.
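To make the difference between the two helper versions concrete, here is a small NumPy-only comparison. It is a sketch: `byteswap_keep_dtype` is a hypothetical name wrapping the two lines quoted in the commit message, and only NumPy arrays are exercised here, not cupy.

```python
import numpy as np


def byteswap_keep_dtype(arr):
    """Reverse the bytes of each element while keeping arr.dtype.

    Hypothetical wrapper around the snippet quoted in the commit message.
    """
    if arr.dtype.itemsize == 1:
        return arr  # single-byte dtypes have nothing to swap
    # View each element as its individual bytes, reverse them, reinterpret.
    u8 = arr.view("u1").reshape(*arr.shape, arr.itemsize)
    return u8[..., ::-1].copy().view(arr.dtype).reshape(arr.shape)


a = np.array([1, 256, 0x1234], dtype=np.uint16)

# Earlier approach: same bytes, reinterpreted with the opposite byte order.
old = a.view(a.dtype.newbyteorder()).copy()

# Newer approach: bytes physically reversed, dtype left untouched.
out = byteswap_keep_dtype(a)

print(old.dtype.isnative, out.dtype.isnative)  # False True
print(out.tolist())                            # [256, 1, 13330]
```

Both results hold the same values, but only the second keeps a native dtype, which is the property the commit message says numba-compiled consumers need.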
Summary
PR #1498 reworked predictor=2 (horizontal differencing) to run sample-wise via a numpy view in the file's byte order. Numba's nopython mode rejects arrays with a non-native byte order, so reading a big-endian TIFF with uint16/uint32/uint64 + predictor=2 raised "TypingError: Unsupported array dtype: >u2" instead of returning the pixel data.

predictor_decode and predictor_encode now byte-swap the buffer in place around the kernel call when the file byte order differs from native, then swap back so the on-disk representation stays intact for the chunk.view(file_dtype) step in _decode_strip_or_tile. uint8 is unaffected since the single-byte path skips the view.

Test plan
test_predictor2_big_endian.py covers uint16/int16/uint32/int32 round-trips through tifffile-built BE predictor=2 deflate files.
(The test_features recursion issue is unrelated.)