4 changes: 4 additions & 0 deletions public/images/kagent-hitl.svg
265 changes: 157 additions & 108 deletions public/sitemap.xml


95 changes: 53 additions & 42 deletions src/app/docs/kagent/concepts/agent-memory/page.mdx
@@ -12,93 +12,104 @@ export const metadata = {

# Agent Memory

kagent provides long-term memory for agents using vector similarity search. Agents can automatically save and retrieve relevant context across conversations.
With agent memory, your agents can automatically save and retrieve relevant context across conversations using vector similarity search. Memory is built on top of the Google ADK memory implementation.

## Overview

Memory in kagent is:
- **Vector-backed** — Uses embedding models to encode memories as 768-dimensional vectors
- **Searchable** — Retrieves relevant memories via cosine similarity
- **Automatic** — Extracts and saves memories periodically without explicit user action
- **Time-bounded** — Memories expire after a configurable TTL (default 15 days)
Agent memory provides the following capabilities.

## Supported Storage Backends

| Backend | Description |
|---------|-------------|
| **pgvector** (PostgreSQL) | Full-featured vector search using the pgvector extension |
| **Turso/libSQL** (SQLite) | Lightweight alternative using SQLite-compatible storage |
- **Vector-backed.** A basic vector store uses embedding models to encode memories as 768-dimensional vectors.
- **Searchable.** Agents retrieve relevant memories via cosine similarity.
- **Automatic.** Agents extract and save user intent, key learnings, and preferences without explicit user action.
- **Time-bounded.** Memories expire after a configurable TTL (default 15 days).
- **Shared storage.** Memory uses the same kagent database (PostgreSQL or SQLite/Turso), not a separate database.

## Configuration

### Enable Memory on an Agent

To enable memory, set the `memory` field on the declarative agent spec. The `modelConfig` field references a `ModelConfig` object whose embedding provider generates memory vectors.

You can also configure memory in the kagent UI when you create or edit an agent by selecting an embedding model and setting the memory TTL.

```yaml
apiVersion: kagent.dev/v1alpha2
kind: Agent
metadata:
name: memory-agent
namespace: kagent
spec:
type: Declarative
declarative:
modelConfig: default-model-config
systemMessage: "You are a helpful assistant with long-term memory."
memory:
enabled: true
embeddingProvider:
type: OpenAI
model: text-embedding-3-small
modelConfig: default-model-config # References the embedding provider
```

### Memory with Separate Embedding Provider
### Custom TTL

To change the default memory retention period, set the `ttlDays` field.

```yaml
memory:
enabled: true
embeddingProvider:
type: OpenAI
model: text-embedding-3-small
secretRef:
name: openai-embedding-key
modelConfig: default-model-config
ttlDays: 30 # Memories expire after 30 days instead of the default 15
```

## How Memory Works

### Automatic Save Cycle

1. The agent processes user messages normally
2. Every 5th user message, the agent automatically extracts key information
3. Extracted memories are summarized and encoded as embedding vectors
4. Vectors are stored in the database with metadata and timestamps
1. The agent processes user messages normally.
2. Every 5th user message, the agent automatically extracts key information such as user intent, key learnings, and preferences.
3. The agent summarizes extracted memories and encodes them as embedding vectors.
4. The agent stores the vectors in the database with metadata and timestamps.
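The trigger in step 2 is a simple periodic check. This sketch assumes a fixed interval of 5, matching the description above; the function name is illustrative, not a kagent internal.

```python
SAVE_INTERVAL = 5  # extract memories on every 5th user message

def should_extract(user_message_count: int, interval: int = SAVE_INTERVAL) -> bool:
    """Return True when the message count reaches a multiple of the interval."""
    return user_message_count > 0 and user_message_count % interval == 0
```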

### Memory Retrieval (Prefetch)

Before generating a response, the agent:
1. Encodes the current user message as an embedding vector
2. Searches stored memories by cosine similarity
3. Injects the most relevant memories into the agent's context
Before generating a response, the agent performs the following steps.

1. Encodes the current user message as an embedding vector.
2. Searches stored memories by cosine similarity.
3. Injects the most relevant memories into the agent's context.
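The retrieval steps above amount to a nearest-neighbor search by cosine similarity. A minimal sketch, using toy 2-dimensional vectors in place of real 768-dimensional embeddings:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec: list[float], memories: list[tuple[str, list[float]]], k: int = 3) -> list[str]:
    """Rank (text, vector) pairs by similarity to the query and return the top k texts."""
    ranked = sorted(
        memories,
        key=lambda m: cosine_similarity(query_vec, m[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]
```

In production the query vector comes from the configured embedding provider, and the search runs inside the database (pgvector or libSQL) rather than in application code.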

### Memory Tools

When memory is enabled, three tools are injected into the agent:
When you enable memory, the agent receives three additional tools.

| Tool | Description |
|------|-------------|
| `save_memory` | Explicitly save a piece of information |
| `load_memory` | Search for relevant memories by query |
| `prefetch_memory` | Automatically run before response generation |
| `save_memory` | Explicitly saves a piece of information. |
| `load_memory` | Searches for relevant memories by query. |
| `prefetch_memory` | Runs automatically before response generation. |

You can also instruct the agent to use `save_memory` or `load_memory` explicitly during a conversation.

### Viewing Memories

In the kagent UI, you can view the memories that an agent has saved. This lets you inspect what the agent has learned and retained from past interactions.

## Memory Management via API

The following API endpoints let you manage memories programmatically.

```
GET /api/memories/{agentId} # List memories
DELETE /api/memories/{agentId} # Clear all memories
DELETE /api/memories/{agentId}/{id} # Delete specific memory
POST /api/memories/sessions # Add a memory
POST /api/memories/sessions/batch # Add memories in batch
POST /api/memories/search # Search memories by cosine similarity
GET /api/memories?agent_name=X&user_id=Y # List memories
DELETE /api/memories?agent_name=X&user_id=Y # Clear all memories for an agent
```
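A small helper can assemble these URLs for a client. This sketch only builds paths against the endpoints listed above — it performs no requests, and the function names are illustrative.

```python
from urllib.parse import urlencode

BASE = "/api/memories"

def list_memories_url(agent_name: str, user_id: str) -> str:
    """URL for listing (GET) or clearing (DELETE) an agent's memories."""
    return f"{BASE}?{urlencode({'agent_name': agent_name, 'user_id': user_id})}"

def search_url() -> str:
    """URL for cosine-similarity search over memories."""
    return f"{BASE}/search"
```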

## Limitations

- **No per-memory deletion.** You can clear all memories for an agent, but you cannot delete individual memory entries.
- **No cross-agent memory sharing.** Each agent has its own isolated memory store. You cannot share memories across agents.
- **Not pluggable.** Memory is built on the Google ADK memory implementation and cannot be swapped for an alternative memory solution (such as Cognee). However, if an alternative memory solution exposes an MCP server, you can add it as a tool and instruct the agent to use it instead of the built-in memory.

## Technical Details

- Embedding vectors are normalized to 768 dimensions
- Background TTL pruning runs periodically (default retention: 15 days)
- Memory is per-agent — each agent has its own isolated memory store
- Memories include timestamps and source session references
- Embedding vectors are normalized to 768 dimensions.
- Background TTL pruning runs periodically (default retention: 15 days).
- Memories include timestamps and source session references.
63 changes: 46 additions & 17 deletions src/app/docs/kagent/concepts/context-management/page.mdx
@@ -12,45 +12,74 @@ export const metadata = {

# Context Management

Long conversations can exceed LLM context windows. kagent provides **event compaction** to automatically summarize older messages, preserving key information while reducing token count.
Long conversations can exceed LLM context windows. With **event compaction**, you can automatically summarize older messages to preserve key information while reducing token count.

## Configuration

To enable compaction, set the `context.compaction` field on the declarative agent spec.

```yaml
apiVersion: kagent.dev/v1alpha2
kind: Agent
metadata:
name: long-conversation-agent
namespace: kagent
spec:
type: Declarative
declarative:
modelConfig: default-model-config
systemMessage: "You are a helpful agent for extended sessions."
context:
eventsCompaction:
enabled: true
compaction:
compactionInterval: 5 # Compact every 5 user invocations
```

### Configuration Options

| Field | Default | Description |
|-------|---------|-------------|
| `compactionInterval` | `5` | Number of new user invocations before triggering a compaction. |
| `overlapSize` | `2` | Number of preceding invocations to include for context overlap. |
| `eventRetentionSize` | — | Number of most recent events to always retain. |
| `tokenThreshold` | — | Post-invocation token threshold. When the prompt token count meets or exceeds this value, compaction triggers. |
| `summarizer` | — | Optional LLM-based summarizer configuration (see below). |
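Putting these options together, a fuller configuration might look like the following. The values are examples for illustration, not recommendations.

```yaml
context:
  compaction:
    compactionInterval: 5      # compact every 5 user invocations
    overlapSize: 2             # carry 2 preceding invocations forward for context
    eventRetentionSize: 20     # always keep the 20 most recent events
    tokenThreshold: 8000       # also compact once the prompt reaches 8000 tokens
```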

### Summarizer

By default, compacted events are dropped from the context without summarization. To preserve a summary of compacted events, configure the `summarizer` field.

```yaml
context:
compaction:
compactionInterval: 5
summarizer:
modelConfig: summarizer-model-config # Optional: uses agent's own model if omitted
```

| Field | Description |
|-------|-------------|
| `summarizer.modelConfig` | Name of a `ModelConfig` resource to use for summarization. If omitted, the agent's own model is used. |
| `summarizer.promptTemplate` | Custom prompt template for the summarizer. |

## How It Works

1. As the conversation progresses, events accumulate in the session history
2. When the history approaches the model's context limit, compaction triggers
3. Older events are summarized while preserving:
- Key decisions and outcomes
- Important tool results
- Critical context from earlier in the conversation
4. The agent continues with the compacted history seamlessly
1. As the conversation progresses, events accumulate in the session history.
2. Compaction triggers when one of two conditions is met: the number of new user invocations reaches the `compactionInterval`, or the prompt token count meets or exceeds the `tokenThreshold` (if set).
3. If a `summarizer` is configured, older events are summarized by an LLM. Otherwise, compacted events are dropped from the context.
4. The agent continues with the compacted history seamlessly.

## When to Use

Enable event compaction when:
- Agents handle long-running conversations (debugging sessions, investigations)
- Agents call many tools that generate large outputs
- You want to support extended interactions without hitting context limits
Enable event compaction in the following scenarios.

- Agents handle long-running conversations (debugging sessions, investigations).
- Agents call many tools that generate large outputs.
- You want to support extended interactions without hitting context limits.

You may not need event compaction for the following scenarios.

You may not need it for:
- Short, single-turn interactions
- Agents with small tool sets that generate compact outputs
- Short, single-turn interactions.
- Agents with small tool sets that generate compact outputs.

## Context Caching Note

37 changes: 21 additions & 16 deletions src/app/docs/kagent/concepts/git-based-skills/page.mdx
@@ -12,15 +12,16 @@ export const metadata = {

# Git-Based Skills

Skills are markdown-based knowledge documents that agents load at startup. They provide domain-specific instructions, best practices, and procedures that guide agent behavior.
Skills are markdown-based knowledge documents that agents load at startup. You can use skills to provide domain-specific instructions, best practices, and procedures that guide agent behavior.

kagent supports two sources for skills:
- **OCI images** — Container images containing skill files (original approach)
- **Git repositories** — Clone skills directly from Git repos
You can load skills from two sources.

- **OCI images.** Container images containing skill files (original approach).
- **Git repositories.** Clone skills directly from Git repos.

## Skill File Format

Each skill is a directory containing a `SKILL.md` file with YAML frontmatter:
Each skill is a directory containing a `SKILL.md` file with YAML frontmatter.

```markdown
---
@@ -48,6 +49,7 @@
apiVersion: kagent.dev/v1alpha2
kind: Agent
metadata:
name: my-agent
namespace: kagent
spec:
type: Declarative
declarative:
@@ -61,22 +63,24 @@

### With Subdirectory

You can specify a subdirectory within the repository to use as the skill root.

```yaml
skills:
gitRefs:
- url: https://github.com/myorg/monorepo.git
ref: main
path: skills/kubernetes
path: skills/kubernetes # Only use this subdirectory
```

### Multiple Sources

Combine Git and OCI skills:
You can combine Git and OCI skills in the same agent.

```yaml
skills:
refs:
- image: ghcr.io/myorg/k8s-skills:latest
- ghcr.io/myorg/k8s-skills:latest
gitRefs:
- url: https://github.com/myorg/skills-repo.git
ref: main
Expand All @@ -94,6 +98,7 @@ apiVersion: v1
kind: Secret
metadata:
name: git-credentials
namespace: kagent
type: Opaque
stringData:
token: ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
@@ -123,15 +128,15 @@
skills:

## How It Works

Under the hood, kagent uses a lightweight init container (~30MB) containing Git and krane tools:
Under the hood, a lightweight init container (~30MB) containing Git and krane tools runs before the agent pod starts.

1. Before the agent pod starts, the `skills-init` container runs
2. It clones each Git repository to the skills volume
3. It also pulls any OCI skill images
4. The agent runtime discovers skills from the mounted volume at startup
1. The `skills-init` container clones each Git repository to the skills volume.
2. The container pulls any OCI skill images.
3. The agent runtime discovers skills from the mounted volume at startup.
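Step 3 — runtime discovery — can be sketched as a scan of the mounted volume for skill directories. The layout (one `SKILL.md` per directory) follows the format described earlier; the function name is an assumption, not the actual runtime code.

```python
from pathlib import Path

def discover_skills(skills_root: str) -> list[str]:
    """Return names of directories under skills_root that contain a SKILL.md."""
    root = Path(skills_root)
    return sorted(p.parent.name for p in root.glob("*/SKILL.md"))
```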

## Skill Discovery at Runtime

Once loaded, skills are available through the built-in `SkillsTool`:
- **List skills:** The agent calls the tool with no arguments to see available skills
- **Load skill:** The agent calls the tool with a skill name to get the full content
Once loaded, skills are available through the built-in `SkillsTool`.

- **List skills.** The agent calls the tool with no arguments to see available skills.
- **Load skill.** The agent calls the tool with a skill name to get the full content.