This document describes the NetData integration for OpsAgent. NetData handles metrics collection through its 800+ collectors and built-in health alerts, and OpsAgent adds AI-powered analysis and auto-remediation on top of those alerts.
┌─────────────────────────────────────────────────────────────────┐
│ NetData Agent (port 19999) │
│ • Collects metrics via 800+ collectors (system, apps, databases, etc.) │
│ • Built-in health alerts and ML anomaly detection │
│ • Historical data retention │
└─────────────────────────────────────────────────────────────────┘
│
▼ HTTP API (poll /api/v1/alarms)
┌─────────────────────────────────────────────────────────────────┐
│ OpsAgent │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────────────┐ │
│ │ Alert │ │ AI Agent │ │ Auto-Remediation │ │
│ │ Listener │→ │ (OpenCode) │→ │ (safe actions) │ │
│ └──────────────┘ └──────────────┘ └──────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌──────────────┼──────────────┐
▼ ▼ ▼
┌────────────┐ ┌────────────┐ ┌────────────┐
│ Discord │ │ Turso │ │ Dashboard │
│ Alerts │ │ DB │ │ (3001) │
└────────────┘ └────────────┘ └────────────┘
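The "Auto-Remediation (safe actions)" box in the diagram can be read as a small gating decision. The sketch below is only an illustration: the `RecommendedAction` type, the `risk` field, and the `decide` function are hypothetical names, and it assumes that a safe action falls back to human approval when `autoRemediate` is disabled.

```typescript
// Hypothetical shapes for illustration; not OpsAgent's actual types.
type Risk = "safe" | "risky";

interface RecommendedAction {
  description: string; // e.g. "restart nginx"
  risk: Risk; // assumed to be assigned by the AI agent
}

type Decision = "execute" | "await-approval";

// Only safe actions run unattended, and only when autoRemediate is enabled.
// Assumption: everything else waits for a human.
function decide(action: RecommendedAction, autoRemediate: boolean): Decision {
  return autoRemediate && action.risk === "safe" ? "execute" : "await-approval";
}

const restartService: RecommendedAction = { description: "restart nginx", risk: "safe" };
const deleteData: RecommendedAction = { description: "delete old database files", risk: "risky" };

console.log(decide(restartService, true)); // execute
console.log(decide(restartService, false)); // await-approval
console.log(decide(deleteData, true)); // await-approval
```

Keeping the decision in one pure function makes the safety rule easy to test separately from the code that actually runs commands.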
- 800+ Instant Integrations - PostgreSQL, Redis, Nginx, Docker, and more
- ML Anomaly Detection - Built-in unsupervised machine learning
- Battle-Tested Collectors - Optimized C code, not JavaScript
- Historical Data - 1+ year retention, not just real-time
- Keep Your Differentiator - AI auto-remediation is still unique to OpsAgent
# Install NetData and OpsAgent
./bin/opsagent.sh netdata-install
# Start OpsAgent with NetData
./bin/opsagent.sh start-netdata

# Clone the repository
git clone https://github.com/sjcotto/opsagent.git
cd opsagent
# Create environment file
cat > .env << EOF
OPENCODE_API_KEY=your-opencode-key
DISCORD_WEBHOOK_URL=your-discord-webhook
TURSO_DATABASE_URL=your-turso-url
TURSO_AUTH_TOKEN=your-turso-token
SERVER_NAME=docker-test
EOF
# Start with Docker Compose
docker compose -f docker-compose.netdata.yml up -d
# Or use the npm script
npm run docker:netdata
# View logs
npm run docker:netdata:logs

- NetData Dashboard: http://localhost:19999
- OpsAgent Dashboard: http://localhost:3001
The integration uses a YAML configuration file (config/netdata.yaml):
netdata:
  # NetData API endpoint
  url: "http://localhost:19999"
  # Polling interval (seconds)
  pollInterval: 30
  # Which alerts to monitor
  monitorSeverity: "warning" # Options: warning, critical, all
  # Acknowledge NetData alerts after OpsAgent processes them
  acknowledgeAlerts: true
  # Map NetData severity to OpsAgent severity
  severityMapping:
    warning: "warning"
    critical: "critical"
    clear: "resolved"
  # Alert name patterns to ignore (regex)
  ignoreAlerts:
    - "test.*"
    - ".*_debug"
  # Alert name patterns to force-include
  forceAlerts:
    - ".*disk_full.*"
    - ".*oom.*"
opsagent:
  # Auto-execute safe actions
  autoRemediate: false
  # AI model
  model: "kimi-k2.5"
discord:
  enabled: true
  webhookUrl: "${DISCORD_WEBHOOK_URL}"
  notifyOnCritical: true
  notifyOnAgentAction: true
dashboard:
  enabled: true
  port: 3001

| Variable | Required | Description |
|---|---|---|
| OPENCODE_API_KEY | Yes | OpenCode API key for AI agent |
| DISCORD_WEBHOOK_URL | No | Discord webhook for notifications |
| TURSO_DATABASE_URL | No | Turso database for multi-server storage |
| TURSO_AUTH_TOKEN | No | Turso authentication token |
| SERVER_NAME | No | Custom server name |
| NETDATA_URL | No | NetData URL (default: http://localhost:19999) |
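As a rough illustration of the rules in the table above, the sketch below resolves the environment variables and expands `${VAR}` placeholders such as the `webhookUrl` value in the YAML file. The names `resolveEnv`, `interpolate`, and `ResolvedEnv` are hypothetical and do not come from the OpsAgent codebase. Expanding an unset variable to an empty string is also an assumption.

```typescript
// Hypothetical helpers; this is not OpsAgent's actual config loader.
type Env = { [name: string]: string | undefined };

interface ResolvedEnv {
  opencodeApiKey: string;
  netdataUrl: string;
  discordWebhookUrl?: string;
  tursoDatabaseUrl?: string;
  tursoAuthToken?: string;
  serverName?: string;
}

// Apply the Required column and the documented NETDATA_URL default.
function resolveEnv(env: Env): ResolvedEnv {
  const apiKey = env.OPENCODE_API_KEY;
  if (!apiKey) {
    throw new Error("OPENCODE_API_KEY is required");
  }
  return {
    opencodeApiKey: apiKey,
    netdataUrl: env.NETDATA_URL || "http://localhost:19999",
    discordWebhookUrl: env.DISCORD_WEBHOOK_URL,
    tursoDatabaseUrl: env.TURSO_DATABASE_URL,
    tursoAuthToken: env.TURSO_AUTH_TOKEN,
    serverName: env.SERVER_NAME,
  };
}

// Expand ${VAR} placeholders inside a config string.
// Assumption: an unset variable expands to an empty string.
function interpolate(value: string, env: Env): string {
  return value.replace(
    /\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g,
    (_match: string, name: string) => env[name] || "",
  );
}

const exampleEnv: Env = {
  OPENCODE_API_KEY: "example-key",
  DISCORD_WEBHOOK_URL: "https://example.invalid/hook",
};

console.log(resolveEnv(exampleEnv).netdataUrl); // http://localhost:19999
console.log(interpolate("${DISCORD_WEBHOOK_URL}", exampleEnv)); // https://example.invalid/hook
```

Passing the environment in as a plain object, rather than reading a global, keeps the helpers easy to exercise in tests.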
- NetData Collects Metrics
  - Runs as a separate service on port 19999
  - Collects system, application, and service metrics
  - Evaluates health alerts based on configuration
- OpsAgent Polls Alerts
  - Every 30 seconds (configurable), queries http://netdata:19999/api/v1/alarms
  - Detects new, changed, and cleared alerts
  - Filters by severity and ignore patterns
- AI Agent Analyzes
  - Sends alert context to OpenCode AI (kimi-k2.5)
  - AI determines if auto-remediation is safe
  - Recommends actions (kill process, restart service, notify human)
- Actions Executed
  - Safe actions run automatically (if autoRemediate: true)
  - Risky actions require human approval via Discord
  - All actions logged to Turso database
- Notifications Sent
  - Discord notifications for critical alerts
  - Dashboard updates via WebSocket
  - Alert resolution notifications
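The "detects new, changed, and cleared alerts" step can be sketched as a comparison between two consecutive polls. This is a minimal illustration, not the actual collector. The `Snapshot` shape (alert name mapped to a lowercase status) is a simplification rather than the real `/api/v1/alarms` response format, and `diffSnapshots` is a hypothetical name. It also assumes that an alert missing from the latest response counts as cleared. The severity mapping mirrors the `severityMapping` block in the configuration.

```typescript
// Simplified, hypothetical view of one poll: alert name -> status.
type Status = "warning" | "critical" | "clear";
type Snapshot = { [alertName: string]: Status };

interface AlertChange {
  name: string;
  kind: "new" | "changed" | "cleared";
  severity: string; // OpsAgent severity after mapping
}

// Mirrors severityMapping from config/netdata.yaml.
const severityMapping: { [S in Status]: string } = {
  warning: "warning",
  critical: "critical",
  clear: "resolved",
};

function isRaised(status: Status | undefined): status is "warning" | "critical" {
  return status === "warning" || status === "critical";
}

function diffSnapshots(previous: Snapshot, current: Snapshot): AlertChange[] {
  const changes: AlertChange[] = [];
  Object.keys(current).forEach((name) => {
    const before: Status | undefined = previous[name];
    const now: Status | undefined = current[name];
    if (isRaised(now) && !isRaised(before)) {
      changes.push({ name, kind: "new", severity: severityMapping[now] });
    } else if (isRaised(now) && before !== now) {
      changes.push({ name, kind: "changed", severity: severityMapping[now] });
    } else if (now === "clear" && isRaised(before)) {
      changes.push({ name, kind: "cleared", severity: severityMapping.clear });
    }
  });
  // Assumption: a raised alert that vanished from the response has cleared.
  Object.keys(previous).forEach((name) => {
    if (!(name in current) && isRaised(previous[name])) {
      changes.push({ name, kind: "cleared", severity: severityMapping.clear });
    }
  });
  return changes;
}

const previousPoll: Snapshot = { cpu_usage: "warning", ram_in_use: "warning", disk_space: "critical" };
const currentPoll: Snapshot = { cpu_usage: "critical", ram_in_use: "clear", oom_kill: "warning" };

diffSnapshots(previousPoll, currentPoll).forEach((c) => {
  console.log(c.kind + " " + c.name + " -> " + c.severity);
});
// changed cpu_usage -> critical
// cleared ram_in_use -> resolved
// new oom_kill -> warning
// cleared disk_space -> resolved
```

Only the changes, not the full alert list, would then need to go to the AI agent, which keeps repeated polls of an unchanged alert from triggering repeated analysis.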
# Install NetData
./bin/opsagent.sh netdata-install
# Install with custom options
./bin/opsagent.sh netdata-install --port 19999 --user-only
# Check NetData status
./bin/opsagent.sh netdata-status
# View NetData logs
./bin/opsagent.sh netdata-logs
./bin/opsagent.sh netdata-logs 100 # Last 100 lines
# Reload NetData health config
./bin/opsagent.sh netdata-reload
# Show NetData config location
./bin/opsagent.sh netdata-config
# Start OpsAgent with NetData
./bin/opsagent.sh start-netdata
# Run with NetData in foreground (development)
./bin/opsagent.sh run-netdata

# Stop OpsAgent
./bin/opsagent.sh stop
# Restart
./bin/opsagent.sh restart
# Check status
./bin/opsagent.sh status
# View logs
./bin/opsagent.sh logs
./bin/opsagent.sh logs-live

With NetData, you instantly get monitoring for:
- CPU, memory, disk, network
- Load average, I/O wait, temperature
- File descriptors, processes
- PostgreSQL, MySQL, MongoDB, Redis
- Query performance, connections, replication
- Nginx, Apache, HAProxy
- Request rates, response codes, latency
- Docker containers, Kubernetes
- Container resource usage, pod health
- Node.js, Python, Java, Go
- Custom application metrics via StatsD
- Systemd, Cron, Postfix
- System services health
See NetData Integrations for the full list.
You can customize NetData's built-in alerts or create new ones:
# Edit health configuration (run from the NetData config directory, typically /etc/netdata)
sudo ./edit-config health.d/cpu.conf
# Reload health config
sudo netdatacli reload-health

Example custom alert:
alarm: custom_app_errors
on: nginx.requests
lookup: sum -5m unaligned of bad_requests
units: requests
every: 1m
warn: $this > 10
crit: $this > 50
info: High number of bad requests detected
to: sysadmin

# Run all tests
bun test
# Run specific test file
bun test tests/collector/netdata.test.ts

# Build legacy mode
bun run build
# Build NetData mode
bun run build:netdata

# Check if NetData is running
curl http://localhost:19999/api/v1/info
# Check NetData logs
./bin/opsagent.sh netdata-logs
# Restart NetData
sudo systemctl restart netdata

- Check the URL in config/netdata.yaml
- Verify NetData is accessible from OpsAgent container (if using Docker)
- Check firewall rules
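A common cause of the Docker case above is a `localhost` URL: inside the OpsAgent container, `localhost` refers to the container itself, not to the NetData container. The sketch below rewrites such a URL to use a Compose service name. The function `containerSafeUrl` is hypothetical, and the default service name `netdata` is taken from the URL and service list used elsewhere in this document.

```typescript
// Hypothetical helper: swap a loopback host for a Compose service name.
// URLs that already point elsewhere are returned unchanged.
function containerSafeUrl(configuredUrl: string, serviceName: string = "netdata"): string {
  return configuredUrl.replace(
    /^(https?:\/\/)(?:localhost|127\.0\.0\.1)(?=[:\/]|$)/,
    (_match: string, scheme: string) => scheme + serviceName,
  );
}

console.log(containerSafeUrl("http://localhost:19999")); // http://netdata:19999
console.log(containerSafeUrl("http://127.0.0.1:19999/api/v1/info")); // http://netdata:19999/api/v1/info
console.log(containerSafeUrl("http://monitoring.internal:19999")); // http://monitoring.internal:19999
```

In practice the same effect is usually achieved by setting `NETDATA_URL` for the container. The helper only illustrates which URLs are likely to fail.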
- Check NetData has active alerts:
  curl http://localhost:19999/api/v1/alarms
- Verify monitorSeverity setting in config
- Check if alerts are being filtered by ignoreAlerts
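When an expected alert never shows up, it can help to reason through the filters one at a time. The sketch below does this with several assumptions, since the actual filtering code is not shown here: patterns are matched against the whole alert name, `forceAlerts` takes precedence over `ignoreAlerts` and over the severity filter, and `monitorSeverity: "warning"` means "warning and above". The `explainFilter` function and its types are hypothetical.

```typescript
// Hypothetical filter model; precedence and matching rules are assumptions.
interface FilterConfig {
  monitorSeverity: "warning" | "critical" | "all";
  ignoreAlerts: string[];
  forceAlerts: string[];
}

interface Verdict {
  monitored: boolean;
  reason: string;
}

// Assumption: each pattern must match the entire alert name.
function matchesAny(name: string, patterns: string[]): boolean {
  return patterns.some((p) => new RegExp("^(?:" + p + ")$").test(name));
}

function explainFilter(
  name: string,
  status: "warning" | "critical",
  config: FilterConfig,
): Verdict {
  if (matchesAny(name, config.forceAlerts)) {
    return { monitored: true, reason: "matches forceAlerts" };
  }
  if (matchesAny(name, config.ignoreAlerts)) {
    return { monitored: false, reason: "matches ignoreAlerts" };
  }
  if (config.monitorSeverity === "critical" && status !== "critical") {
    return { monitored: false, reason: "below monitorSeverity" };
  }
  return { monitored: true, reason: "passes all filters" };
}

// Values copied from the example configuration in this document.
const exampleFilter: FilterConfig = {
  monitorSeverity: "warning",
  ignoreAlerts: ["test.*", ".*_debug"],
  forceAlerts: [".*disk_full.*", ".*oom.*"],
};

console.log(explainFilter("test_latency", "warning", exampleFilter).reason); // matches ignoreAlerts
console.log(explainFilter("test_disk_full", "warning", exampleFilter).reason); // matches forceAlerts
console.log(explainFilter("cpu_usage", "warning", exampleFilter).reason); // passes all filters
```

Note that broad patterns can match more than intended. For example, `.*oom.*` also matches any alert name containing "room".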
If you're currently using the legacy (systeminformation) mode:
- Install NetData:
  ./bin/opsagent.sh netdata-install
- Stop legacy mode:
  ./bin/opsagent.sh stop
- Start NetData mode:
  ./bin/opsagent.sh start-netdata
- Update your configuration from config/default.yaml to config/netdata.yaml
The docker-compose.netdata.yml includes:
- netdata: NetData agent (port 19999)
- opsagent: OpsAgent with NetData integration (port 3001)
- postgres: PostgreSQL for testing database monitoring
- redis: Redis for testing cache monitoring
- nginx: Nginx for testing web server monitoring
When adding features to the NetData integration:
- Update src/collector/netdata.ts for alert collection
- Update src/config/netdata-loader.ts for configuration
- Update src/index-netdata.ts for the main logic
- Add tests in tests/collector/netdata.test.ts
- Update this documentation
MIT License - see LICENSE file for details.