First /add can fail when tiktoken downloads o200k_base at runtime #277

@rendigua2025-gif

Summary

The first /add request can fail with HTTP 500 if tokenizer initialization tries to download o200k_base.tiktoken at runtime and that network request fails.

In an isolated restart/cold-start test, the failure happened before any business write was committed. After prewarming the tokenizer cache with tiktoken.get_encoding("o200k_base"), the same test succeeded against a new memory root/session.

Why this matters

This makes a fresh EverOS service depend on runtime network access to an external tokenizer asset during the first memory write. If that download fails, users see a generic /add 500 even though the memory pipeline itself has not started writing data yet.

This is especially surprising for Docker/Linux isolated tests and server deployments where outbound network/TLS behavior may vary.

Observed evidence

First cold /add:

/add -> HTTP 500

Failure occurred before writes:

unprocessed_buffer: 0
memcell: 0
conversation_status: 0
md_change_state: 0
Markdown marker: none
search result count: 0

Traceback shape:

tiktoken_ext/openai_public.py: o200k_base
tiktoken/load.py: read_file_cached
requests.get("https://openaipublic.blob.core.windows.net/encodings/o200k_base.tiktoken")
SSLError: UNEXPECTED_EOF_WHILE_READING

After tokenizer prewarm:

tiktoken.get_encoding("o200k_base")

A new memory root/session completed successfully, and restart recovery worked as expected.
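The prewarm step can be scripted so it runs at image build or deploy time instead of during the first request. This is a minimal sketch, not EverOS code: prewarm_tokenizer and its loader parameter are hypothetical names, and it assumes tiktoken honors the TIKTOKEN_CACHE_DIR environment variable when resolving the encoding file.

```python
import os
from typing import Callable, Optional


def prewarm_tokenizer(
    cache_dir: str,
    encoding_name: str = "o200k_base",
    loader: Optional[Callable[[str], object]] = None,
) -> object:
    """Load the encoding once so its file is stored under cache_dir.

    `loader` is a hypothetical injection point for tests; by default
    tiktoken.get_encoding is used.
    """
    os.makedirs(cache_dir, exist_ok=True)
    # Assumption: tiktoken reads TIKTOKEN_CACHE_DIR when it looks up
    # the encoding file, so set it before the first load.
    os.environ["TIKTOKEN_CACHE_DIR"] = cache_dir
    if loader is None:
        import tiktoken  # imported lazily, after the cache dir is set

        loader = tiktoken.get_encoding
    # The first call may download the file; later calls read the cache.
    return loader(encoding_name)
```

Running this once where outbound network access is known to work (for example in a Docker build step), and setting the same TIKTOKEN_CACHE_DIR in the runtime environment, should keep the first /add from needing the download.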

Expected behavior

EverOS could make this cold-start dependency more explicit and reliable. Possible options:

  • Prewarm tokenizer assets during everos init or server startup.
  • Document that first use may download tokenizer assets and how to pre-cache them.
  • Fail startup or preflight checks with a clear tokenizer/cache error instead of returning a generic /add 500.
  • Avoid runtime network fetches in the request path where possible.
  • Provide an offline/cache configuration path for Docker/server deployments.
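The preflight option could look roughly like the sketch below. It is illustrative only: TokenizerUnavailableError, preflight_tokenizer, and the loader parameter are hypothetical names rather than existing EverOS APIs. The only tiktoken call assumed is tiktoken.get_encoding.

```python
from typing import Callable, Optional


class TokenizerUnavailableError(RuntimeError):
    """Hypothetical error raised when the tokenizer cannot be loaded."""


def preflight_tokenizer(
    encoding_name: str = "o200k_base",
    loader: Optional[Callable[[str], object]] = None,
) -> object:
    """Try to load the encoding at startup and fail with a clear message.

    `loader` is an injection point for tests; by default
    tiktoken.get_encoding is used.
    """
    if loader is None:
        import tiktoken

        loader = tiktoken.get_encoding
    try:
        return loader(encoding_name)
    except Exception as exc:
        # Wrap the low-level failure (e.g. an SSLError from the download)
        # in an error that names the tokenizer and suggests a remedy.
        raise TokenizerUnavailableError(
            f"Tokenizer encoding {encoding_name!r} could not be loaded. "
            "It may need to be downloaded on first use; pre-cache it or "
            "check outbound network/TLS access."
        ) from exc
```

Calling a check like this during server startup would surface the download problem before any /add request is accepted, rather than as a generic 500 on the first write.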

Environment

EverOS: 1.0.0 source checkout
Runtime: Docker Linux runtime
Python: 3.12
Data: synthetic sample only

No real user memory or secrets were used.
