fix: delete zero-byte cache files on load instead of leaving stale locks #23

Closed
jburos wants to merge 1 commit into nsiccha:dev from generable:dev

Conversation

@jburos jburos commented Apr 2, 2026

Summary

When a computation is interrupted mid-write (process kill, OOM, crash), the cache file is left as a zero-byte placeholder. get_cache_status returns :started for these files.

The current handling warns but leaves the file in place, which means subsequent recomputation attempts also fail — the zero-byte file acts as a permanent lock requiring manual deletion.

Relation to 5185bd5

Commit 5185bd5 ("retry after deserialization failure") handles corrupted non-empty files that fail to deserialize. This PR handles the complementary case: zero-byte files left by interrupted writes.

| Case | 5185bd5 | This PR |
| --- | --- | --- |
| Non-empty file, deserialization error | ✅ delete + retry | |
| Zero-byte file (:started status) | ✗ warns, leaves file | ✅ delete + retry |

Together they cover both failure modes cleanly.

Change

# Before
cache_status == :started && @warn "Cache file $cache_path exists but has size 0.\nAssuming a previous run failed."
touch(cache_path)

# After
if cache_status == :started
    @warn "Cache file $cache_path exists but has size 0. Removing and recomputing."
    rm(cache_path)
end
touch(cache_path)
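
For context, the status check and the new branch can be sketched end to end. This is a minimal sketch, not the package's actual code: `sketch_cache_status` is a hypothetical stand-in for `get_cache_status`, the `:finished` status name and the guard around `touch` are assumptions, and `force=true` is an optional extra that tolerates the file already being gone.

```julia
# Hypothetical stand-in for get_cache_status; the real implementation may differ.
function sketch_cache_status(cache_path::AbstractString)
    isfile(cache_path) || return :unstarted
    # A zero-byte file is treated as a placeholder from an interrupted run.
    # :finished is an assumed name for the non-empty case.
    return filesize(cache_path) == 0 ? :started : :finished
end

# Sketch of the "After" branch: remove a stale placeholder, then recreate it.
function sketch_prepare_cache(cache_path::AbstractString)
    status = sketch_cache_status(cache_path)
    if status == :started
        @warn "Cache file $cache_path exists but has size 0. Removing and recomputing."
        rm(cache_path; force=true)  # force=true: no error if the file is already gone
    end
    # Assumption: only (re)create the placeholder when there is no finished result.
    status == :finished || touch(cache_path)
    return status
end
```

Calling `sketch_prepare_cache` on a path that holds a zero-byte file warns, removes the file, and leaves a fresh placeholder in its place.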

🤖 Generated with Claude Code

When a computation is interrupted mid-write, the cache file is left as a
zero-byte placeholder (:started status). Previously this just warned and
left the file in place, which caused subsequent recomputation attempts to
fail (e.g. AssertionError on __status__ checks or other errors) because
the zero-byte file continued to be detected as :started on every retry.

This change deletes the zero-byte file before touch()-ing a fresh one,
so the recomputation proceeds exactly as it would for a fresh (:unstarted)
cache miss.

Complements 5185bd5 ("retry after deserialization failure") which handles
corrupted non-empty files. Together they cover both failure modes:
- Non-empty file that fails to deserialize → caught, deleted, retried (5185bd5)
- Zero-byte file left by interrupted write → deleted, retried (this PR)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
nsiccha (Owner) commented Apr 2, 2026

Hm, at some point I had added the more verbose warnings and the error to figure out what exactly was going on. The issue was that in the dependency graph it makes sense to cache expensive leaves, but if they are accessed via multiple paths, these paths don't synchronize. This is now fixed by a global, locked dict of path => lock per type+property, which enforces synchronization, so there are no more deserialization failures.
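
The synchronization described in the comment above can be sketched roughly as follows. This is an illustration under assumptions, not the repository's code: `PATH_LOCKS`, `PATH_LOCKS_GUARD`, `path_lock`, and `with_path_lock` are hypothetical names, and a single global dict is used here instead of one per type+property.

```julia
# Hypothetical registry: one lock per cache path, guarded by a registry lock.
const PATH_LOCKS = Dict{String,ReentrantLock}()
const PATH_LOCKS_GUARD = ReentrantLock()

# Return the lock for `path`, creating it on first use.
function path_lock(path::AbstractString)
    lock(PATH_LOCKS_GUARD) do
        get!(() -> ReentrantLock(), PATH_LOCKS, String(path))
    end
end

# Run `compute` while holding the lock for `path`, so callers that reach the
# same cached leaf via different paths wait for each other instead of racing.
function with_path_lock(compute, path::AbstractString)
    lock(compute, path_lock(path))
end
```

A caller would wrap the load-or-compute step for a given cache file in `with_path_lock(path) do ... end`, so only one task at a time reads or writes that file.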

@nsiccha nsiccha closed this Apr 2, 2026