A production-ready, enterprise-grade AI agent platform built on Amazon EKS using Kagent, featuring comprehensive observability, intelligent gateway routing, and multi-agent orchestration.
This project demonstrates a complete AI agent platform with:
- Multiple agent patterns - Simple agents, K8s operators, multi-tool agents, and multi-agent collaboration
- Production observability - LLM tracing, distributed tracing, cost tracking, and infrastructure metrics
- Intelligent gateway - Rate limiting, caching, fallbacks, and load balancing via LiteLLM
- Real-world use case - Financial services multi-agent system with agent-to-agent (A2A) communication
```
┌───────────────────────────────────────────────────────────┐
│                       Agent Platform                      │
│                                                           │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐   │
│  │ Simple Agent │   │   K8s Ops    │   │  Multi-Tool  │   │
│  │              │   │    Agent     │   │    Agent     │   │
│  └──────┬───────┘   └──────┬───────┘   └──────┬───────┘   │
│         │                  │                  │           │
│         └──────────────────┼──────────────────┘           │
│                            │                              │
│     ┌──────────────────────┼──────────────────────┐       │
│     │       Financial Services Multi-Agent        │       │
│     │                                             │       │
│     │   ┌────────────┐        ┌────────────┐      │       │
│     │   │ Portfolio  │        │    Risk    │      │       │
│     │   │  Analyst   │        │ Assessment │      │       │
│     │   └─────┬──────┘        └─────┬──────┘      │       │
│     │         │                     │             │       │
│     │         └──────────┬──────────┘             │       │
│     │                    │                        │       │
│     │          ┌────────▼─────────┐               │       │
│     │          │    Financial     │               │       │
│     │          │     Advisor      │               │       │
│     │          │  (Orchestrator)  │               │       │
│     │          └──────────────────┘               │       │
│     └──────────────────────┴──────────────────────┘       │
│                            │                              │
└────────────────────────────┼──────────────────────────────┘
                             │
               ┌────────────▼─────────────┐
               │     LiteLLM Gateway      │
               │  - Rate Limiting         │
               │  - Caching (Redis)       │
               │  - Fallbacks             │
               │  - Cost Tracking         │
               └────────────┬─────────────┘
                            │
               ┌────────────▼─────────────┐
               │      Amazon Bedrock      │
               │    Claude 3.5 Sonnet     │
               └──────────────────────────┘
```
Observability Stack

```
     ┌─────────────┬─────────────┬──────────────┐
     │             │             │              │
┌────▼────┐   ┌────▼────┐   ┌────▼─────┐   ┌────▼────┐
│Langfuse │   │ Jaeger  │   │Prometheus│   │ Grafana │
│LLM Trace│   │Dist.Trac│   │ Metrics  │   │   Viz   │
└─────────┘   └─────────┘   └──────────┘   └─────────┘
```
Basic agent demonstrating core Kagent functionality with Bedrock integration.
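As a rough shape of what such an agent definition looks like, consider the sketch below. The CRD fields here (API group/version, modelConfig, systemMessage) are assumptions for illustration; 01-first-agent/sample-agent.yaml is the authoritative example.

```yaml
apiVersion: kagent.dev/v1alpha1      # assumed kagent API group/version
kind: Agent
metadata:
  name: sample-agent
  namespace: kagent
spec:
  description: Basic demo agent answering general questions via Bedrock
  modelConfig: bedrock-model-config  # assumed name of a ModelConfig resource
  systemMessage: |
    You are a helpful assistant. Answer concisely.
```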
Kubernetes-aware agent that can query and manage cluster resources.
Smart assistant with multiple capabilities via MCP (Model Context Protocol):
- Calculator - Mathematical computations
- Web Search - Real-time information retrieval
- Weather - Current weather data
- DateTime - Timezone-aware date/time operations
Production-ready multi-agent system demonstrating agent-to-agent (A2A) collaboration:
Specialist Agents:
- Portfolio Analyst - Portfolio valuation and analysis
- Risk Assessment - Risk evaluation and compliance
- Market Data - Real-time market information
Orchestrator:
- Financial Advisor - Coordinates specialists to provide comprehensive financial advice
Example Interaction:
User: "I have 100 AAPL and 50 GOOGL shares. Is my portfolio balanced?"
```
Financial Advisor (Orchestrator)
├── Portfolio Analyst: Calculate total value
├── Risk Assessment: Evaluate risk profile
├── Market Data: Get current prices
└── Synthesizes response with actionable advice
```
Intelligent proxy for LLM requests with enterprise features:
- ✅ Rate Limiting - 100 RPM, 100K TPM (configurable per agent)
- ✅ Caching - Redis-backed response caching (1-hour TTL)
- ✅ Fallbacks - Claude Sonnet → Claude Haiku on failures
- ✅ Load Balancing - Distribute across multiple model instances
- ✅ Cost Tracking - Real-time token usage and cost monitoring
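These features are driven by the LiteLLM proxy configuration. A minimal sketch of the model registration and Redis cache wiring is below; the Bedrock model IDs, AWS region, and Redis service address are assumptions here, and the repo's litellm-config.yaml is authoritative.

```yaml
model_list:
  - model_name: bedrock-claude-3-5-sonnet
    litellm_params:
      model: bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0   # assumed model ID
      aws_region_name: us-east-1                                 # assumed region
  - model_name: bedrock-claude-3-haiku
    litellm_params:
      model: bedrock/anthropic.claude-3-haiku-20240307-v1:0      # assumed model ID
      aws_region_name: us-east-1

litellm_settings:
  cache: true
  cache_params:
    type: redis
    host: redis.litellm.svc.cluster.local   # assumed in-cluster Redis service
    ttl: 3600
```

Because every agent talks to this one endpoint, rate limits, caching, and fallbacks are enforced centrally rather than configured per agent.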
LLM-specific observability platform:
- 🔍 Trace every LLM call - Prompts, completions, tokens, costs
- 💰 Cost analytics - Per-agent, per-model, per-request
- 🐛 Debug conversations - Full context and tool calls
- 📈 Usage trends - Token consumption over time
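The cost figures come from simple arithmetic over token counts and per-model prices. A toy sketch of that calculation (the prices below are illustrative placeholders, not current Bedrock pricing):

```python
# Hypothetical per-1K-token prices in USD -- placeholders, not real Bedrock pricing.
PRICES = {
    "claude-3-5-sonnet": {"input": 0.003, "output": 0.015},
    "claude-3-haiku": {"input": 0.00025, "output": 0.00125},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one LLM call, derived from token counts as a tracing tool would."""
    p = PRICES[model]
    return input_tokens / 1000 * p["input"] + output_tokens / 1000 * p["output"]

# A cached response never reaches the model, so it reports zero tokens and $0 cost.
print(round(request_cost("claude-3-5-sonnet", 1200, 400), 4))  # 0.0096
```

The same arithmetic is why falling back to a cheaper model cuts spend so sharply, and why cache hits show up in Langfuse as $0 requests.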
Distributed tracing for agent interactions:
- 🔗 Agent-to-agent traces - A2A communication flows
- ⏱️ Latency analysis - Identify bottlenecks
- 🔍 Request correlation - End-to-end visibility
Infrastructure and application metrics:
- 📊 Kagent controller metrics - Reconciliation rates, errors
- 🖥️ Resource usage - CPU, memory, network per agent
- 🚨 Alerting - High error rates, latency spikes
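Alerts like these can be encoded as a PrometheusRule that the Prometheus Operator picks up. The rule below is a sketch: the `http_requests_total` metric name and its labels are assumptions, so substitute whatever your agents and the Kagent controller actually export.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: agent-platform-alerts
  namespace: monitoring
spec:
  groups:
    - name: agents
      rules:
        - alert: HighAgentErrorRate
          # Assumed metric name; adapt to your agents' exported metrics.
          expr: |
            sum(rate(http_requests_total{namespace="kagent",code=~"5.."}[5m]))
              / sum(rate(http_requests_total{namespace="kagent"}[5m])) > 0.05
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: Agent error rate above 5% for 10 minutes
```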
- Amazon EKS cluster (1.28+)
- kubectl configured
- Helm 3.x
- AWS credentials with Bedrock access
- Podman or Docker (for building custom tools)
```bash
# Install Kagent CRDs and operator
cd 00-initial-setup
kubectl apply -f bedrock-key.yaml
kubectl apply -f litellm-config.yaml
kubectl apply -f litellm-deploy.yaml

# Install Kagent via Helm
helm install kagent-crds oci://public.ecr.aws/kagent-dev/kagent-crds --version 0.7.9 -n kagent --create-namespace
helm install kagent oci://public.ecr.aws/kagent-dev/kagent --version 0.7.9 -n kagent -f values.yaml
```

```bash
cd 05-observability/langfuse

# Deploy Langfuse
kubectl apply -f 00-langfuse-secrets.yaml
kubectl apply -f 01-postgres.yaml
kubectl apply -f 02-langfuse-deployment.yaml

# Setup LiteLLM gateway features
./setup-gateway-features.sh

# Deploy Jaeger
kubectl apply -f ../tracing/jaeger.yaml

# Deploy Prometheus ServiceMonitor
kubectl apply -f ../prometheus/kagent-servicemonitor.yaml
```

```bash
# Simple agent
kubectl apply -f 01-first-agent/sample-agent.yaml

# K8s ops agent
kubectl apply -f 02-k8s-ops-agent/k8s-ops-agent.yaml

# Multi-tool agent
cd 03-multi-tool-agent
./deploy.sh

# Financial services multi-agent
cd 04-multi-agents/financial-services
./deploy.sh
```

```bash
# Kagent UI
kubectl port-forward -n kagent svc/kagent-ui 8080:8080

# Langfuse (LLM tracing & costs)
kubectl port-forward -n langfuse svc/langfuse 3000:3000

# Jaeger (distributed tracing)
kubectl port-forward -n jaeger svc/jaeger 16686:16686

# Grafana (metrics)
kubectl port-forward -n monitoring svc/kube-prom-stack-grafana 3001:80
```

- Open http://localhost:3000
- Navigate to Traces
- See every LLM call with:
- Input/output tokens
- Cost per request
- Latency
- Model used
- Cache hits (shows $0 cost)
- Open http://localhost:16686
- Select a service (e.g., `financial-advisor`)
- See distributed traces showing:
- Agent-to-agent calls
- Tool invocations
- End-to-end latency
- Open http://localhost:3001 (admin/prom-operator)
- Explore dashboards for:
- Kagent controller operations
- Agent resource usage
- Request rates and errors
Agents can call other agents as tools, enabling:
- Specialization - Each agent focuses on specific domain
- Orchestration - Coordinator agents delegate to specialists
- Scalability - Add new specialists without changing orchestrator
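In Kagent terms, the orchestrator simply lists the specialist agents among its tools. The sketch below is illustrative only; the field names (`tools`, `type: Agent`, `ref`) are assumptions, and 04-multi-agents/financial-services contains the real manifests.

```yaml
apiVersion: kagent.dev/v1alpha1      # assumed API group/version
kind: Agent
metadata:
  name: financial-advisor
  namespace: kagent
spec:
  description: Orchestrator that delegates to specialist agents
  modelConfig: bedrock-model-config
  tools:
    - type: Agent            # assumed: specialists exposed as A2A tools
      agent:
        ref: portfolio-analyst
    - type: Agent
      agent:
        ref: risk-assessment
    - type: Agent
      agent:
        ref: market-data
```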
Standardized way for agents to access tools:
- RemoteMCPServer - Tools running as separate services
- Tool Discovery - Agents discover available tools dynamically
- Streaming - Real-time tool responses
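Wiring in a remote tool server might look roughly like this; the service URL and the exact spec fields are assumptions, so treat the manifests in 03-multi-tool-agent as the source of truth.

```yaml
apiVersion: kagent.dev/v1alpha1      # assumed API group/version
kind: RemoteMCPServer
metadata:
  name: multi-tool-server
  namespace: kagent
spec:
  # Assumed in-cluster address of the MCP tool service
  url: http://multi-tool-server.kagent.svc.cluster.local:8000/mcp
```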
LiteLLM acts as intelligent gateway:
- Single endpoint - All agents use same LLM endpoint
- Centralized control - Rate limits, caching, fallbacks
- Observability - Every request traced to Langfuse
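Pointing every agent at the gateway usually comes down to a single model configuration targeting LiteLLM's OpenAI-compatible endpoint. A sketch, where the field names and service address are assumptions (compare with the setup in 00-initial-setup):

```yaml
apiVersion: kagent.dev/v1alpha1      # assumed API group/version
kind: ModelConfig
metadata:
  name: bedrock-model-config
  namespace: kagent
spec:
  provider: OpenAI                   # LiteLLM speaks the OpenAI API
  model: bedrock-claude-3-5-sonnet   # model_name registered in LiteLLM
  openAI:
    baseUrl: http://litellm.litellm.svc.cluster.local:4000   # assumed gateway address
```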
Edit `05-observability/langfuse/litellm-advanced-config.yaml`:

Rate limiting:

```yaml
litellm_settings:
  rpm_limit: 100      # Requests per minute
  tpm_limit: 100000   # Tokens per minute
```

Caching:

```yaml
litellm_settings:
  cache: true
  cache_params:
    ttl: 3600         # Cache duration in seconds
```

Fallbacks:

```yaml
router_settings:
  fallbacks:
    - bedrock-claude-3-5-sonnet: [bedrock-claude-3-haiku]
```

- LLM Cost - Track spend per agent in Langfuse
- Cache Hit Rate - Target >30% for cost savings
- Error Rate - Alert if >5% in Prometheus
- Latency - P95 should be <5s for good UX
- Enable caching - Saves on repeated queries
- Use fallbacks - Haiku is 10x cheaper than Sonnet
- Set budgets - Prevent runaway costs
- Monitor in Langfuse - Identify expensive agents
This is a reference implementation. Feel free to:
- Add new agent examples
- Enhance observability dashboards
- Improve documentation
- Share your use cases
- Langfuse Setup - `05-observability/langfuse/INSTALL.md`
- LiteLLM Gateway Features - `05-observability/langfuse/LITELLM-GATEWAY-FEATURES.md`
- Multi-Agent System - `04-multi-agents/financial-services/README.md`
This project is provided as-is for educational and reference purposes.
Built with ❤️ using Kagent, Amazon EKS, and Amazon Bedrock