Client timeout + retry can duplicate memcells for identical session/messages payload #274

@rendigua2025-gif

Summary

When a client times out on /api/v1/memory/add or /api/v1/memory/flush, the server may continue processing in the background. If the client then retries the same session_id with the identical messages payload, EverOS can create duplicate memcell rows with the same message_ids_json but different memcell_id values.

Those duplicate memcells can later propagate into Markdown and search results.

Environment

  • EverOS version: 1.0.0
  • Runtime: Linux container
  • Memorize mode: chat
  • Memory root: isolated test memory root
  • Sample data: synthetic text only

No real user memory was used.

Why this matters

Clients cannot safely treat a timeout as a failure. A timed-out request may still generate memcells, Markdown files, cascade updates, and searchable rows. If the client blindly retries the same payload, the memory store can be polluted with duplicate memories.

This is especially risky for agent memory integrations, where callers often retry on HTTP timeout.

Reproduction Outline

Use an isolated memory root and synthetic messages.

  1. Start the EverOS server in chat mode.
  2. Send /api/v1/memory/add with a short client timeout.
  3. Send /api/v1/memory/flush with a short client timeout.
  4. Without auditing backend state, retry the same /add payload:
    • same app_id
    • same project_id
    • same session_id
    • same messages
    • same timestamps
  5. Retry /flush.
  6. Inspect the SQLite memcell table.
  7. Wait for background extraction / cascade to finish.
  8. Inspect Markdown and search results.
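
Steps 2 through 5 can be scripted roughly as follows. This is a minimal sketch, not the exact script used for this report: BASE_URL, the helper names, and the body sent to /flush (only the scope fields) are assumptions, and the transport is injectable so the retry sequence can be checked without a running server.

```python
import json
import socket
import urllib.error
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: point this at your EverOS server


def build_payload():
    """Synthetic payload, identical on every call so a retry replays it exactly."""
    return {
        "session_id": "retry_probe_001",
        "app_id": "test_app",
        "project_id": "retry_audit",
        "messages": [
            {"sender_id": "test_user", "sender_name": "Test User", "role": "user",
             "timestamp": 1780870000000,
             "content": "Remember exactly: retry-token-mu equals audit-mu-772."},
            {"sender_id": "test_assistant", "sender_name": "Test Assistant",
             "role": "assistant", "timestamp": 1780870000500,
             "content": "Acknowledged. I will preserve the exact retry audit token."},
        ],
    }


def post_with_timeout(path, payload, timeout_s, base_url=BASE_URL):
    """POST JSON; return the HTTP status, or "timeout" if the client gives up."""
    req = urllib.request.Request(
        base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout_s) as resp:
            return resp.status
    except (socket.timeout, TimeoutError):
        return "timeout"
    except urllib.error.URLError as exc:
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return "timeout"
        raise


def run_probe(send=post_with_timeout, timeout_s=0.5):
    """Send /add and /flush, then blindly replay both with the same payload."""
    payload = build_payload()
    # Assumption: /flush is addressed by the same scope fields as /add.
    flush_body = {k: payload[k] for k in ("session_id", "app_id", "project_id")}
    results = []
    for attempt in ("first", "retry"):
        results.append((attempt, "add", send("/api/v1/memory/add", payload, timeout_s)))
        results.append((attempt, "flush", send("/api/v1/memory/flush", flush_body, timeout_s)))
    return results
```

Calling run_probe() against a live server and then inspecting the memcell table covers steps 6 through 8.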

Synthetic Payload Shape

Example payload shape:

{
  "session_id": "retry_probe_001",
  "app_id": "test_app",
  "project_id": "retry_audit",
  "messages": [
    {
      "sender_id": "test_user",
      "sender_name": "Test User",
      "role": "user",
      "timestamp": 1780870000000,
      "content": "Remember exactly: retry-token-mu equals audit-mu-772."
    },
    {
      "sender_id": "test_assistant",
      "sender_name": "Test Assistant",
      "role": "assistant",
      "timestamp": 1780870000500,
      "content": "Acknowledged. I will preserve the exact retry audit token."
    }
  ]
}

Observed Behavior

After retrying the same payload, SQLite contained two memcell rows for the same session.

Both rows had the same message_ids_json, but different memcell_id values.

Example observed shape:

memcell_id: mc_<first>
message_ids_json:
  ["m_retry_probe_001_1780870000000_000", "m_retry_probe_001_1780870000500_001"]

memcell_id: mc_<second>
message_ids_json:
  ["m_retry_probe_001_1780870000000_000", "m_retry_probe_001_1780870000500_001"]

After background processing completed:

  • Markdown contained repeated references to the same synthetic marker.
  • Search returned duplicate episode results for the same synthetic marker.
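
A query along these lines can surface the duplicates in SQLite. It is a sketch that assumes the memcell table exposes memcell_id, session_id, and message_ids_json as plain columns; the in-memory table and the mc_first / mc_second / mc_other IDs below are stand-ins for illustration, not real data.

```python
import sqlite3

# Group memcells by session and message-id list; any group with more than
# one row is a duplicate set.
DUPLICATE_QUERY = """
SELECT session_id,
       message_ids_json,
       COUNT(*) AS copies,
       GROUP_CONCAT(memcell_id) AS memcell_ids
FROM memcell
GROUP BY session_id, message_ids_json
HAVING COUNT(*) > 1
"""


def find_duplicate_memcells(conn):
    """Return one row per duplicated (session_id, message_ids_json) pair."""
    return conn.execute(DUPLICATE_QUERY).fetchall()


# Stand-in table with a reduced, assumed column set.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memcell ("
    "memcell_id TEXT PRIMARY KEY, session_id TEXT, message_ids_json TEXT)"
)
ids = '["m_retry_probe_001_1780870000000_000", "m_retry_probe_001_1780870000500_001"]'
conn.executemany(
    "INSERT INTO memcell VALUES (?, ?, ?)",
    [
        ("mc_first", "retry_probe_001", ids),
        ("mc_second", "retry_probe_001", ids),
        ("mc_other", "other_session", '["m_other_session_1_000"]'),
    ],
)

duplicates = find_duplicate_memcells(conn)
for session_id, message_ids_json, copies, memcell_ids in duplicates:
    print(session_id, copies, memcell_ids)
```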

Expected Behavior

EverOS should provide retry-safe behavior for timed-out clients.

Possible acceptable fixes:

  • Support an idempotency_key or client_request_id on /add.
  • Expose request/session processing status so clients can audit before retrying.
  • Prevent duplicate memcell creation for the same app_id, project_id, session_id, track, and message_ids_json.
  • Document safe retry behavior for /add and /flush.

Current Code Observation

The input message IDs appear deterministic for the same session_id, timestamp, and message index.
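
Judging only from the observed IDs, the derivation looks roughly like the sketch below. The function name and the exact format string are inferred from the sample rows above, not taken from the EverOS source.

```python
def message_id(session_id: str, timestamp_ms: int, index: int) -> str:
    """Rebuild a message ID in the shape seen in message_ids_json.

    Assumed format: m_<session_id>_<timestamp_ms>_<zero-padded index>.
    """
    return f"m_{session_id}_{timestamp_ms}_{index:03d}"


# The same inputs always give the same ID, so a replayed payload maps to
# the same message_ids_json.
first = message_id("retry_probe_001", 1780870000000, 0)
second = message_id("retry_probe_001", 1780870000500, 1)
```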

However, memcell_id is minted independently for each boundary result. The memcell table uses memcell_id as primary key and does not appear to enforce uniqueness on:

app_id + project_id + session_id + track + message_ids_json

The buffer merge path deduplicates by message_id, but once messages have been consumed into a memcell and the buffer is cleared, replaying the same payload can create a new memcell.
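
The failure mode can be modeled in a few lines. ToyMemory below is a hypothetical stand-in, not EverOS code: it only mirrors the two properties described above, a buffer keyed by message_id and a fresh memcell_id minted on each flush.

```python
import uuid


class ToyMemory:
    """Toy model: buffer dedupes by message_id, flush mints a new memcell_id."""

    def __init__(self):
        self.buffer = {}    # message_id -> message
        self.memcells = []  # (memcell_id, sorted message_ids)

    def add(self, messages):
        # Duplicate message_ids are ignored while they are still buffered.
        for msg in messages:
            self.buffer.setdefault(msg["id"], msg)

    def flush(self):
        if not self.buffer:
            return None
        memcell_id = "mc_" + uuid.uuid4().hex[:8]
        self.memcells.append((memcell_id, sorted(self.buffer)))
        self.buffer.clear()  # after this, the dedupe has nothing to compare against
        return memcell_id


messages = [
    {"id": "m_retry_probe_001_1780870000000_000"},
    {"id": "m_retry_probe_001_1780870000500_001"},
]

store = ToyMemory()
store.add(messages)
store.add(messages)  # replay before flush: absorbed by the buffer
store.flush()

store.add(messages)  # replay after flush: buffer is empty, nothing blocks it
store.flush()
```

After the second flush the toy store holds two memcells with identical message ID lists and different memcell_id values, which matches the observed rows.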

Suggested Direction

The safest contract would be API-level idempotency:

  • Client supplies an idempotency_key.
  • EverOS stores key + payload hash + processing status + result.
  • Retrying the same key returns the original result or current processing state.
  • Same key with different payload returns a conflict error.
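
The contract in the list above could look something like this. It is an in-memory sketch under stated assumptions: IdempotencyStore, IdempotencyConflict, and the status strings are hypothetical names, and a real implementation would need durable storage and locking.

```python
import hashlib
import json


class IdempotencyConflict(Exception):
    """Raised when a key is reused with a different payload."""


class IdempotencyStore:
    def __init__(self):
        self._records = {}  # key -> {"payload_hash", "status", "result"}

    @staticmethod
    def payload_hash(payload):
        # Canonical JSON so key order does not change the hash.
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def begin(self, key, payload):
        """Return (is_new, record). Only a new key should trigger processing."""
        digest = self.payload_hash(payload)
        record = self._records.get(key)
        if record is None:
            record = {"payload_hash": digest, "status": "processing", "result": None}
            self._records[key] = record
            return True, record
        if record["payload_hash"] != digest:
            raise IdempotencyConflict(f"key {key!r} reused with a different payload")
        return False, record

    def complete(self, key, result):
        record = self._records[key]
        record["status"] = "done"
        record["result"] = result
```

A retry that arrives while the first request is still running gets is_new == False with status "processing"; a retry after completion gets the stored result instead of creating a second memcell.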

A lower-level safety net would be to prevent duplicate memcells for the same scoped message_ids_json, but that alone may not fully solve HTTP retry semantics.
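
As a sketch of that safety net, a unique index over the scoped columns plus INSERT OR IGNORE makes a replayed insert a no-op. The column set, the index name uq_memcell_scope, the insert_memcell helper, and the "default" track value are assumptions for illustration. The columns are declared NOT NULL because SQLite treats NULLs as distinct in unique indexes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE memcell (
        memcell_id       TEXT PRIMARY KEY,
        app_id           TEXT NOT NULL,
        project_id       TEXT NOT NULL,
        session_id       TEXT NOT NULL,
        track            TEXT NOT NULL,
        message_ids_json TEXT NOT NULL
    )
    """
)
# Hypothetical constraint: one memcell per scoped message-id list.
conn.execute(
    """
    CREATE UNIQUE INDEX uq_memcell_scope
    ON memcell (app_id, project_id, session_id, track, message_ids_json)
    """
)


def insert_memcell(conn, row):
    """Insert a memcell; return False if an identical scoped row already exists."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO memcell VALUES (?, ?, ?, ?, ?, ?)", row
    )
    return cur.rowcount == 1


ids = '["m_retry_probe_001_1780870000000_000", "m_retry_probe_001_1780870000500_001"]'
scope = ("test_app", "retry_audit", "retry_probe_001", "default", ids)

first_inserted = insert_memcell(conn, ("mc_first",) + scope)
retry_inserted = insert_memcell(conn, ("mc_second",) + scope)
row_count = conn.execute("SELECT COUNT(*) FROM memcell").fetchone()[0]
```

This only blocks the duplicate row. The retrying client still gets no signal about whether the original request succeeded, which is why it complements API-level idempotency rather than replacing it.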
