ProxyMem is a proxy-based memory layer that provides cross-session context persistence for LLM agents. By operating at the proxy layer, ProxyMem captures session data, generates structured summaries via LLM analysis, and enriches future requests with relevant historical context.
ProxyMem operates through three main mechanisms:
- Session Capture: Records session interactions as they pass through the proxy
- Session Analysis: Uses a designated LLM to generate structured summaries after session completion
- Context Injection: Enriches new session requests with relevant historical context from the database
Key features:
- Persistent Context: Maintains awareness of prior work across sessions
- Project-Scoped: Memory is isolated per user and project
- Automatic Summaries: LLM-generated summaries capture goals, decisions, modified files, and remaining tasks
- Privacy Controls: Configurable redaction patterns and user/client deny lists
To use ProxyMem, you must first enable it globally:
```bash
# Enable memory feature
python -m src.core.cli --memory-available

# Enable with default memory on for all sessions
python -m src.core.cli --memory-available --memory-default-enabled
```

Once memory is globally available, users can control it per-session:
| Command | Description |
|---|---|
| `!/memory-on` | Enable memory capture for the current session |
| `!/memory-off` | Disable memory capture for the current session |
| `!/memory-status` | Show current memory state for the session |
| `!/memory-requeue` | Force the current session to requeue for summary generation |
| Flag | Description | Default |
|---|---|---|
| `--memory-available` | Enable the ProxyMem feature globally | `false` |
| `--memory-default-enabled` | Enable memory by default for new sessions | `false` |
| `--memory-summary-model` | Model for generating summaries (`backend:model`) | None |
| `--memory-context-model` | Model for context retrieval (`backend:model`) | None |
| `--memory-summary-prompt` | Path to custom summary prompt file | None |
| `--memory-context-prompt` | Path to custom context prompt file | None |
| `--memory-database-path` | Path to SQLite database | `./var/memory.sqlite3` |
| `--memory-session-timeout` | Minutes of inactivity before session completion | `30` |
| `--memory-summarization-delay` | Seconds to wait before summary generation | `120` |
| `--memory-max-sessions-to-consider` | Max recent sessions to consider for context | `10` |
| `--memory-retention-days` | Days to retain session summaries | `90` |
| `--memory-max-context-tokens` | Maximum tokens for injected context | `2000` |
| `--memory-max-summary-tokens` | Max tokens for summary prompt context | `800` |
| `--memory-max-transcript-chars` | Max transcript length before chunking | `50000` |
| `--memory-summary-completion-tokens` | Max completion tokens for summary generation | `10000` |
| `--memory-context-relevance-threshold` | Minimum relevance score for context | `0.5` |
| `--memory-max-buffer-size-bytes` | Capture buffer size per session | `10485760` |
| `--memory-analysis-queue-maxsize` | Analysis queue max size | `100` |
| `--memory-analysis-timeout` | Summary generation timeout (seconds) | `30` |
| `--memory-max-concurrent-analyses` | Max concurrent analyses | `4` |
| `--memory-context-template` | Template for injected context | None |
| `--memory-single-user-mode` | Use a fixed user ID for all sessions | `false` |
| `--memory-fixed-user-id` | Fixed user ID (required with `--memory-single-user-mode`) | None |
| `--memory-persist-transcript` | Persist transcripts for summaries | `false` |
| `--memory-summary-prompt-version` | Summary prompt version tag | `v1` |
| `--memory-summary-schema-version` | Summary schema version tag | `v1` |
| `--memory-require-project-discovery` | Require project discovery for injection | `true` |
| `--memory-allow-missing-project` | Allow injection without project root | `false` |
| `--memory-project-discovery-mode` | Project discovery mode | `any` |
All CLI flags can be set via environment variables with the MEMORY_ prefix:
```bash
export MEMORY_AVAILABLE=true
export MEMORY_DEFAULT_ENABLED=true
export MEMORY_SUMMARY_MODEL=openai:gpt-4o-mini
export MEMORY_CONTEXT_MODEL=openai:gpt-4o-mini
export MEMORY_DATABASE_PATH=./var/memory.sqlite3
export MEMORY_SESSION_TIMEOUT_MINUTES=30
export MEMORY_RETENTION_DAYS=90
export MEMORY_MAX_CONTEXT_TOKENS=2000
export MEMORY_MAX_SUMMARY_TOKENS=800
export MEMORY_MAX_TRANSCRIPT_CHARS=50000
export MEMORY_SUMMARY_COMPLETION_TOKENS=10000
export MEMORY_CONTEXT_RELEVANCE_THRESHOLD=0.5
export MEMORY_PERSIST_TRANSCRIPT=false
export MEMORY_REQUIRE_PROJECT_DISCOVERY=true
export MEMORY_PROJECT_DISCOVERY_MODE=any
```

Note: `MEMORY_SESSION_TIMEOUT` is accepted as a legacy alias for `MEMORY_SESSION_TIMEOUT_MINUTES`.
Add memory settings to your config.yaml:
```yaml
memory:
  available: true
  default_enabled: false
  summary_model: "openai:gpt-4o-mini"
  context_model: "openai:gpt-4o-mini"
  database_path: "./var/memory.sqlite3"
  session_timeout_minutes: 30
  retention_days: 90
  max_context_tokens: 2000
  max_sessions_to_consider: 10
  context_relevance_threshold: 0.5
```

Configuration values are resolved in the following order (highest priority first):
- CLI arguments
- Environment variables
- Configuration file
- Default values
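The precedence above can be sketched as a simple resolution chain. This is an illustrative helper, not the proxy's actual code; the function name and argument shapes are assumptions for the example.

```python
import os

def resolve_setting(name, cli_args, config_file, default):
    """Resolve a memory setting: CLI > environment > config file > default.

    `name` is the setting key (e.g. "retention_days"); the environment
    variable is derived with the MEMORY_ prefix. Hypothetical helper for
    illustration only.
    """
    if name in cli_args and cli_args[name] is not None:
        return cli_args[name]
    env_key = "MEMORY_" + name.upper()
    if env_key in os.environ:
        return os.environ[env_key]
    if name in config_file:
        return config_file[name]
    return default

# A CLI argument wins over a config-file value:
value = resolve_setting("retention_days", {"retention_days": 30}, {"retention_days": 90}, 90)
```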
When memory is enabled for a session:
- User prompts are captured as they pass through the proxy
- Assistant responses are captured after receiving from the backend
- Interactions are buffered in memory (up to configurable limit)
- Metadata (model, tokens, timestamps) is recorded with each interaction
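The capture behavior above, including the buffer limit described later under resource usage, can be sketched as follows. The class and field names are invented for this example and do not reflect ProxyMem's internals.

```python
import time

class SessionBuffer:
    """Illustrative in-memory capture buffer (not the actual ProxyMem class).

    Buffers interactions with metadata up to a byte limit; once the limit
    would be exceeded, further interactions are dropped and the session is
    flagged as partial, mirroring the documented behavior.
    """
    def __init__(self, max_bytes=10 * 1024 * 1024):
        self.max_bytes = max_bytes
        self.size = 0
        self.partial = False
        self.interactions = []

    def capture(self, role, content, model=None, tokens=None):
        entry_size = len(content.encode("utf-8"))
        if self.size + entry_size > self.max_bytes:
            self.partial = True  # buffer full: drop the interaction and flag
            return False
        self.interactions.append({
            "role": role,
            "content": content,
            "model": model,       # metadata recorded with each interaction
            "tokens": tokens,
            "timestamp": time.time(),
        })
        self.size += entry_size
        return True

buf = SessionBuffer(max_bytes=10)
buf.capture("user", "hello")          # fits within the limit
buf.capture("assistant", "a" * 100)   # exceeds the limit: dropped, session partial
```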
Sessions are marked complete when:
- Timeout: No activity for the configured timeout period (default: 30 minutes)
- Explicit Close: The client closes the connection
Upon completion, the session is queued for background analysis.
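The timeout-based completion check might look like the following sketch (assumed logic; ProxyMem's actual detection may work differently, e.g. via a periodic sweep):

```python
import time

def is_session_complete(last_activity, timeout_minutes=30, now=None):
    """Illustrative inactivity check: a session is considered complete
    once `timeout_minutes` have elapsed since its last interaction."""
    now = now if now is not None else time.time()
    return (now - last_activity) >= timeout_minutes * 60
```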
The summary generator:
- Builds a transcript from captured interactions
- Applies redaction patterns to remove sensitive data
- Chunks large transcripts if they exceed limits
- Calls the configured summary model with an XML-structured prompt
- Validates the XML response against the schema
- Parses and persists the summary to the database
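The steps above can be sketched end to end. Everything here is illustrative: `call_model` stands in for the configured summary backend, only one redaction pattern is shown, and the real XML prompt and schema are not reproduced.

```python
import re
import xml.etree.ElementTree as ET

def generate_summary(interactions, call_model, max_chars=50000):
    """Illustrative summary pipeline following the documented steps."""
    # Step 1: build a transcript from captured interactions
    transcript = "\n".join(f"{i['role']}: {i['content']}" for i in interactions)
    # Step 2: apply redaction (a single example pattern here)
    transcript = re.sub(r"Bearer\s+[A-Za-z0-9_-]+", "[REDACTED]", transcript)
    # Step 3: chunk if the transcript exceeds the limit (naive fixed-width split)
    chunks = [transcript[i:i + max_chars] for i in range(0, len(transcript), max_chars)]
    # Step 4: call the configured summary model
    response = call_model(chunks)
    # Step 5: validate that the response is well-formed XML
    root = ET.fromstring(response)
    # Step 6: parse into a structure suitable for persistence
    return {child.tag: child.text for child in root}

fake_model = lambda chunks: "<summary><title>Demo session</title></summary>"
summary = generate_summary([{"role": "user", "content": "hi"}], fake_model)
```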
Generated summaries include:
- Title: One-sentence description
- Scope: High-level area/component description
- Goals: Main objectives of the session
- Key Decisions: Important architectural or design decisions
- Modified Files: Files created, modified, or deleted
- Git Operations: Commits, branches, merges, etc.
- Tests Run: Test executions with pass/fail status
- Errors: Significant errors encountered
- Remaining Tasks: Open or blocked tasks
- Completion Status: `completed`, `partial`, or `abandoned`
For new sessions with memory enabled:
- Recent session summaries are retrieved for the user and project
- Summaries are scored by relevance to the current prompt
- High-relevance context is formatted and injected
- Context appears as a virtual message before the first user message
Context injection only occurs once per session (on the first request).
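The once-per-session injection can be sketched as below. The message shape (a system-style message prepended to the list) is an assumption for illustration; the proxy determines the actual format of the virtual message.

```python
def inject_context(messages, context, already_injected):
    """Illustrative injection: prepend historical context as a virtual
    message before the first user message, at most once per session."""
    if already_injected or not context:
        return messages, already_injected
    enriched = [{"role": "system", "content": context}] + messages
    return enriched, True

msgs = [{"role": "user", "content": "continue the refactor"}]
enriched, injected = inject_context(msgs, "Previous session: renamed module X.", False)
```

A second call with the injected flag set leaves the messages untouched, matching the once-per-session rule.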
- All memory operations require an authenticated `user_id`
- Summaries are scoped to `user_id` (and optionally `tenant_id`)
- Context retrieval only returns summaries for the requesting user
- Project scoping ensures cross-project isolation
Configure regex patterns to redact sensitive data:

```yaml
memory:
  redaction_patterns:
    - "(?i)(api[_-]?key|password|secret|token)\\s*[=:]\\s*['\"]?[^\\s'\"]*"
    - "Bearer\\s+[A-Za-z0-9_-]+"
```

Redaction is applied:
- Before calling the summary model
- Before persisting summaries
- Before logging any content
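The two example patterns from the config above, applied in Python (the replacement token `[REDACTED]` is an assumption; the proxy's actual placeholder may differ):

```python
import re

PATTERNS = [
    r"(?i)(api[_-]?key|password|secret|token)\s*[=:]\s*['\"]?[^\s'\"]*",
    r"Bearer\s+[A-Za-z0-9_-]+",
]

def apply_redaction(text):
    """Replace every match of each configured pattern."""
    for pat in PATTERNS:
        text = re.sub(pat, "[REDACTED]", text)
    return text

clean = apply_redaction("api_key = 'sk-123' and Authorization: Bearer abc123")
```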
Block specific users or clients from memory:

```bash
# Via CLI
--memory-disable-user user123 --memory-disable-user user456
--memory-disable-client "Roo Cline"
```

```yaml
# Via config file
memory:
  disabled_users:
    - "user123"
    - "user456"
  disabled_clients:
    - "Roo Cline"
```

For personal deployments, bypass user authentication:

```bash
python -m src.core.cli --memory-available --memory-single-user-mode --memory-fixed-user-id "personal"
```

ProxyMem uses the proxy's unified database layer for storage. By default, this is SQLite, but PostgreSQL is also supported for production deployments. See Database Configuration for details.
The schema includes:
- `session_summaries`: Stores all session summary data
  - Indexed by `user_id`, `session_start`, `project_id`
  - Full XML analysis preserved for debugging
- `user_project_dirs`: Maps user+project_root pairs to stable project IDs
The database is automatically created on first use, and migrations are applied automatically on startup.
- Sessions older than `retention_days` are automatically deleted
- Cleanup runs periodically (default: daily)
- No full transcripts are persisted by default; only structured summaries (see `--memory-persist-transcript`)
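A cleanup job along these lines could implement the retention rule. The column names are assumed for this sketch and may not match the actual schema.

```python
import sqlite3

def cleanup_expired(db_path, retention_days=90):
    """Illustrative retention cleanup: delete summaries whose
    session_start falls outside the retention window."""
    conn = sqlite3.connect(db_path)
    try:
        cur = conn.execute(
            "DELETE FROM session_summaries "
            "WHERE session_start < datetime('now', ?)",
            (f"-{retention_days} days",),
        )
        conn.commit()
        return cur.rowcount  # number of summaries deleted
    finally:
        conn.close()
```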
Override the default summary prompt:

```bash
--memory-summary-prompt ./config/prompts/custom_summary.md
```

Available template variables:
- `{session_transcript}` - Full session transcript
- `{session_id}`, `{user_id}`, `{tenant_id}`
- `{project_id}`, `{project_root}`
- `{model}`, `{branch}`, `{head_sha}`
- `{analysis_timestamp}`
- `{summary_schema_version}`, `{summary_prompt_version}`
- `{max_tokens}`
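Given the brace-style placeholders, substitution presumably works like Python's `str.format`; a minimal sketch with an invented template (not the shipped default prompt):

```python
# Hypothetical contents of a custom summary prompt file.
template = (
    "Summarize session {session_id} for user {user_id}.\n"
    "Transcript:\n{session_transcript}\n"
    "Limit: {max_tokens} tokens."
)

prompt = template.format(
    session_id="abc123",
    user_id="user1",
    session_transcript="user: hello\nassistant: hi",
    max_tokens=800,
)
```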
Override the default context prompt:

```bash
--memory-context-prompt ./config/prompts/custom_context.md
```

Available template variables:
- `{user_prompt}` - Current user message
- `{session_summaries}` - Formatted historical summaries
- `{user_id}`, `{tenant_id}`
- `{project_id}`, `{project_root}`
- `{max_tokens}`
Check that:
- `--memory-available` is set
- User is not in the `disabled_users` list
- Client is not in the `disabled_clients` list
- Project root is discovered (if `require_project_discovery` is true)
- User ID is present (unless in single-user mode)
Context may be skipped when:
- No historical sessions exist for the user/project
- No summaries meet the relevance threshold
- Project root is required but not discovered
- Context retrieval times out
Use `!/memory-status` to check the current state.
If the database becomes corrupted:
- Stop the proxy
- Back up or remove `./var/memory.sqlite3`
- Restart; the schema will be recreated automatically
The capture buffer has a configurable maximum size (default: 10MB per session). If exceeded, the session is marked as partial and remaining interactions are dropped.
Summary generation runs in a background queue with:
- Configurable queue size (default: 100)
- Per-job timeout (default: 30 seconds)
- Maximum concurrent analyses (default: 4)
Backpressure is applied when the queue is full; sessions may be dropped.
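The drop-on-full behavior can be sketched with a bounded queue (illustrative only; function and variable names are invented):

```python
import queue

def enqueue_for_analysis(q, session_id):
    """Illustrative backpressure: when the analysis queue is full,
    the session is dropped rather than blocking the proxy."""
    try:
        q.put_nowait(session_id)
        return True
    except queue.Full:
        return False

small = queue.Queue(maxsize=1)   # real default is 100 (--memory-analysis-queue-maxsize)
enqueue_for_analysis(small, "session-1")            # accepted
dropped = enqueue_for_analysis(small, "session-2")  # queue full: dropped
```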
Context retrieval adds latency to the first request of each session. To minimize impact:
- Keep `max_sessions_to_consider` reasonable (default: 10)
- Use a fast model for context generation
- Set an appropriate `context_relevance_threshold`
- CLI Parameters - Full CLI reference
- Configuration - Configuration file format
- SSO Authentication - User identity integration