Skip to content

gh-138862: fix timestamp increment after UUIDv7 counter overflow#146606

Open
picnixz wants to merge 10 commits intopython:mainfrom
picnixz:fix/uuid/v7-overflow-138862
Open

gh-138862: fix timestamp increment after UUIDv7 counter overflow#146606
picnixz wants to merge 10 commits intopython:mainfrom
picnixz:fix/uuid/v7-overflow-138862

Conversation

@picnixz
Copy link
Copy Markdown
Member

@picnixz picnixz commented Mar 29, 2026

Comment thread Lib/test/test_uuid.py Outdated
Comment thread Lib/uuid.py Outdated
Comment thread Lib/uuid.py
Comment thread Lib/uuid.py
Comment thread Lib/uuid.py Outdated
Comment thread Lib/uuid.py Outdated
Comment thread Lib/test/test_uuid.py Outdated
Comment thread Lib/test/test_uuid.py Outdated
Comment thread Lib/uuid.py
@picnixz picnixz marked this pull request as draft March 29, 2026 19:43
@picnixz picnixz marked this pull request as ready for review March 29, 2026 21:41
@picnixz picnixz marked this pull request as draft March 29, 2026 21:41
@picnixz
Copy link
Copy Markdown
Member Author

picnixz commented Mar 29, 2026

I'll need to add a test for multiple overflows to see that I only increment on overflow.

@picnixz picnixz marked this pull request as ready for review April 11, 2026 12:15
@picnixz picnixz requested review from eendebakpt and hugovk April 11, 2026 12:16
@picnixz
Copy link
Copy Markdown
Member Author

picnixz commented Apr 11, 2026

@hugovk @eendebakpt Could you have a look at this? TL;DR: it shouldn't happen in practice because the counter is a randomized 41-bit integer and its max value is $2^{42}-1$ so we won't have an issue. The only possibility where this may be an issue is if the clock only goes backward for some reasons and we keep incrementing the counter for all the UUIDs being collected. Note that we can only hit the counter overflow after $2^{41}$ samples.

From a practical PoV, $2^{41}$ is honestly the limit of physical machines for attack complexities in general in terms of memory (for instance, decoding attacks are no more relevant if you need more than $2^{40}$ bits as it's more a physical limitation rather than a time limitation). So from a security PoV, it does make sense to fix this edge case.

So, for a realistic attack, (or issue) one needs one of the following situations to happen:

  • clock only goes backward or we are never able to catch up the updated timestamps, and more than $2^{41}$ samples are collected to hit the first counter overflow; after the first counter overflow, the current code would update the timestamp more and more in the future, and we would never be able to catch it up.
  • you need to generate $2^{41}$ UUIDs within the same millisecond. Highly improbable (we can generate $10^6$ UUIDs, but not $10^{12}$).

Comment thread Lib/uuid.py
# after a counter overflow. We follow the RFC for in the former
# case. In the latter case, we re-use the already advanced
# timestamp (it was updated when we detected the overflow).
if _last_counter_v7_overflow:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first case from the if-else is the "latter" case in the comments above. Maybe swap the order in the comments.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right :') I changed so much this block that I forgot to update the comment. Thanks!

Comment thread Lib/uuid.py
nanoseconds = time.time_ns()
timestamp_ms = nanoseconds // 1_000_000

if _last_timestamp_v7 is None or timestamp_ms > _last_timestamp_v7:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side note: if we initialize _last_timestamp_v7 to a large negative value (e.g. -2**64), then we can remove the last_timestamp_v7 is None check.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or just -1 and it's ok. You won't have a negative timestamp I think.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could (according to Claude when setting the system time to before 1970). But since the timestamp is generated via a int64_t the value -2**64 should be safe.

@eendebakpt
Copy link
Copy Markdown
Contributor

@picnixz I agree with your analysis. Whether or not to merge this PR (it fixes the bug) or leave as is (this will not happen in practice) I am 50/50 on.

@picnixz
Copy link
Copy Markdown
Member Author

picnixz commented Apr 11, 2026

I don't think we are loosing much speed. We are only adding one comparison every time we are within the same millisecond. But this adds a bit complexity that we will need to replicate in C though not necessarily hard to do. But still, it's a bit annoying I think. I can also document the implementation more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants