refactor(storage)!: sharpen Get and List contracts#122
Conversation
|
|
| Get(ctx context.Context, req DownloadRequest) (*DownloadResponse, error) | ||
| // Get downloads a blob from the storage. On success the returned DownloadResponse.ReadCloser | ||
| // is non-nil and the caller owns closing it. Returns NotFoundError when the blob is not found. | ||
| Get(ctx context.Context, req DownloadRequest) (DownloadResponse, error) |
There was a problem hiding this comment.
could we still leave this a pointer?
- this is a breaking change downstream for uber
*DownloadResponsecontainsio.ReadCloserthat's already a pointer. if return by value, on error it returnsDownloadResponse{}with a zero-valueReadCloser==nil. The caller still has to checkresp.ReadCloser != nil
There was a problem hiding this comment.
I do not want to accumulate a tech debt at this phase. It is not in prod yet and we should bring the API in accordance.
The caller still has to check resp.ReadCloser != nil
For internal interfaces the better approach is to use static checks (Go does not have them) or conventions rather than explicit checks which in well designed application are always dead branch code.
https://medium.com/javarevisited/avoid-verbose-null-checks-b3f11afbfcc9
The interface should contract what object is returned in what conditions. Almost always a non-nil error means that the value object is undefined, means the client should perform no checks and do not use it in any capacity.
Two contract clarifications on the Storage interface, both addressing abstraction leaks where the documented behavior diverged from what implementations actually did. Get returns DownloadResponse by value, not by pointer. The pointer carried no extra information — absence is signaled by NotFoundError, not by a nil response — yet every caller defensively checked `resp == nil` and `resp.ReadCloser == nil`, both dead code under any conforming implementation. The interface now documents that on a nil error the ReadCloser is non-nil and caller-owned, and the redundant checks are removed. List has literal-string-prefix semantics instead of "directory prefix". The original wording was ambiguous and the two bundled implementations diverged: memstorage did strings.HasPrefix while disk did filepath.Walk on a real subdirectory, so the contracts only agreed when the caller passed a full slash-segmented path ending in "/". The interface now mandates strings.HasPrefix semantics — the natural primitive for S3/GCS/in-memory backends — and disk.List walks the longest slash-bounded prefix and filters by HasPrefix, preserving cheap walks for the common case while honoring partial-segment prefixes. Storage gains an interface-level comment stating keys are opaque and the interface has no concept of paths or directories; any structure is a caller convention. Breaking change: out-of-tree implementations of Storage must update Get to return a value and List to do literal-prefix matching. All in-tree implementations (memstorage, disk), callers, mocks, and tests are updated. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Two contract clarifications on the
Storageinterface, both addressing abstraction leaks where the documented behavior diverged from what the bundled implementations actually did. The interface is meant to fit cleanly behind backends as different as in-memory maps, local disk, S3, GCS, and remote RPC services — these changes make that easier and remove dead defensive code from callers.1.
GetreturnsDownloadResponseby valueBefore:
After:
The pointer carried no information: absence is signaled by
NotFoundError, not by a nil response. Every caller defensively checkedresp == nilandresp.ReadCloser == nil— both dead code under any conforming implementation. The tightened contract eliminates fourresp == nilchecks and threeresp.ReadCloser == nilchecks acrosscontroller/,core/storage/graphreader.go, andcore/storage/changedtargetsreader.go.This also lines up with
CLAUDE.md's style guidance: "Value types over pointers — prefer value types for structs, configs, and return values. Use (T, bool) to signal absence instead of *T. Pointers only when mutation or shared ownership is needed."2.
Listis now literal-string-prefix, not "directory prefix"Before:
After:
The original wording was ambiguous and the two bundled implementations diverged.
memstoragedidstrings.HasPrefix(a pure string match);diskdidfilepath.Walkon a real subdirectory. The two contracts agreed only when the caller passed a full slash-segmented path ending in/— which the sole production caller (storageCache.FloorKey) happens to do, but a future backend like S3 (whose natural primitive isListObjects(prefix)) couldn't be plugged in without subtly different behavior.The new contract picks the primitive that all plausible backends can satisfy without contortion.
disk.Listis updated to honor it: it walks the longest slash-bounded path-prefix and filters bystrings.HasPrefix, preserving cheap walks for the common case while correctly returning sibling keys for partial-segment prefixes.3. Interface-level doc on
StorageBreaking change
Out-of-tree
Storageimplementations need to update:Getto returnDownloadResponseby value, and return the zero value alongside any error (includingNotFoundError).Listto do literal-string-prefix matching.All in-tree implementations (
memstorage,disk), callers (controller,orchestrator,core/itg/cache), mocks, and tests are updated in this PR. The only production caller ofList(storageCache.FloorKey) already passes a slash-terminated prefix, so its behavior is unchanged.Test plan
go test ./...green for all affected packagesmake buildgreenmake testgreen — 14/14 Bazel targets passcore/storage/disk/disk_test.go(TestStorage_List) that lock in literal-prefix behavior and would have failed under the old directory-walk implementation:🤖 Generated with Claude Code