Support storage-credentials in REST catalog LoadTableResult#3042
Support storage-credentials in REST catalog LoadTableResult#3042kevinjqliu merged 3 commits intoapache:mainfrom
Conversation
2864ab7 to
b2cd702
Compare
# Rationale for this change While reviewing some open PRs #3041, #3042, I noticed CI kept failing on the Python 3.13 job in the hive tests. Turns out [CPython 3.13.12 ](https://docs.python.org/3.13/whatsnew/changelog.html#id3)was just released and included a change python/cpython#142651 which made `Mock.call_count` thread-safe by deriving it from `len(call_args_list)`. This broke our hive test, which was resetting the counter with `mock.call_count = 0` directly. This switches to use reset_mock(), which properly clears all the internal call tracking state. ## Are these changes tested? `make test` passes ## Are there any user-facing changes? no
The Iceberg REST spec's LoadTableResult includes a storage-credentials field for vended credentials (prefix-scoped temporary STS tokens). PyIceberg was only reading the config field and silently dropping storage-credentials, so vended credentials never reached the FileIO. Per the spec: "Clients must first check whether the respective credentials exist in the storage-credentials field before checking the config for credentials." This adds: - storage_credentials field to TableResponse - Longest-prefix credential resolution (mirroring Java's S3FileIO) - Merging resolved credentials into FileIO with highest precedence Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
b3dab54 to
bc9bacb
Compare
Fokko
left a comment
There was a problem hiding this comment.
Thanks @rcjverhoef for working on this, actually suprised that this isn't in yet 😆
| metadata_location: str | None = Field(alias="metadata-location", default=None) | ||
| metadata: TableMetadata | ||
| config: Properties = Field(default_factory=dict) | ||
| storage_credentials: list[StorageCredential] = Field(alias="storage-credentials", default_factory=list) |
There was a problem hiding this comment.
There was a problem hiding this comment.
thanks @Fokko, would you be able to merge this one as well?
|
Thanks for the PR! This lgtm, i want to test it against a catalog integration before merging |
There was a problem hiding this comment.
Pull request overview
Adds support for Iceberg REST catalog storage-credentials in LoadTableResult, ensuring vended credentials are preferred over config credentials per the latest REST spec.
Changes:
- Extends
TableResponseto parsestorage-credentialsfrom REST load table responses. - Adds resolution logic to pick the best-matching credential via longest-prefix match and merge it with existing IO properties.
- Adds unit tests covering prefix resolution and load-table credential precedence behavior.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
pyiceberg/catalog/rest/__init__.py |
Parses storage-credentials and applies the resolved credential config with higher precedence than config when constructing FileIO. |
tests/catalog/test_rest.py |
Adds tests for longest-prefix credential selection and for load_table preferring storage-credentials over config. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
kevinjqliu
left a comment
There was a problem hiding this comment.
LGTM, i tested this locally with Polaris, which supports credential vending (using #3106)
I could not test the longest prefix match logic since polaris currently only supports returning a single storage credential. The logic matches java so lgtm
| best_match: StorageCredential | None = None | ||
| for cred in storage_credentials: | ||
| if location.startswith(cred.prefix): | ||
| if best_match is None or len(cred.prefix) > len(best_match.prefix): |
There was a problem hiding this comment.
Rationale for this change
Vended credentials don't work with REST catalogs following latests specs. Per latest spec:
This is missing in pyiceberg. REST catalogs that don't expose a config field are hence not compatible.
Are these changes tested?
Yes, ran locally against my REST catalog implementation, added tests.
Are there any user-facing changes?
No API changes. REST catalogs that return vended credentials via storage-credentials will now work correctly.