[POC] Add export group failover and Kafka DLQ replay routing#337

Draft
cleaton wants to merge 2 commits into rotel-dev:main from cleaton:fallback

cleaton commented May 3, 2026

Summary

Background issue: #336

Adds export groups so Rotel can try exporters in priority order, for example ClickHouse first, then Kafka as a local fallback that doubles as a dead-letter queue (DLQ). The goal is to keep the happy path fast and cheap by sending directly to ClickHouse, while still preserving data when ClickHouse is unavailable.

This also adds optional per-receiver target exporters, so a Kafka DLQ consumer can replay back to ClickHouse within the same Rotel instance without routing through the export group again and creating a Kafka loop.

How it works

  • Pipelines can target an export_group instead of a single exporter.
  • The export group retries a failed batch on the next exporter in priority order.
  • A circuit breaker skips exporters that are already known to be unhealthy, mainly as a latency optimization.
  • Kafka receivers can optionally override their target exporters per telemetry type.
  • For DLQ replay, the Kafka receiver can target a ClickHouse exporter directly while OTLP ingest still targets export_group(clickhouse, kafka).
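
The failover and circuit-breaker behavior described above can be sketched roughly as follows. This is an illustrative Python model, not the Rust code in this PR: `ExportGroup`, `Member`, `CircuitBreaker`, `ExportError`, and the `failure_threshold` / `cooldown_secs` parameters are all hypothetical names chosen for the example, and the breaker policy (open after N consecutive failures, probe again after a cooldown) is an assumption.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Optional


class ExportError(Exception):
    """Raised by an exporter when a batch cannot be delivered."""


@dataclass
class CircuitBreaker:
    failure_threshold: int = 3   # consecutive failures before the breaker opens
    cooldown_secs: float = 30.0  # how long to skip the exporter once open
    failures: int = 0
    opened_at: Optional[float] = None

    def allows(self, now: float) -> bool:
        if self.opened_at is None:
            return True
        # After the cooldown, let an attempt through to probe for recovery.
        return now - self.opened_at >= self.cooldown_secs

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self, now: float) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = now


@dataclass
class Member:
    name: str
    send: Callable[[list], None]  # raises ExportError on failure
    breaker: CircuitBreaker = field(default_factory=CircuitBreaker)


class ExportGroup:
    def __init__(self, members: List[Member], clock: Callable[[], float] = time.monotonic):
        self.members = members  # listed in priority order
        self.clock = clock

    def export(self, batch: list) -> str:
        """Try each member in order; return the name of the one that accepted the batch."""
        for member in self.members:
            if not member.breaker.allows(self.clock()):
                continue  # known-unhealthy exporter: skip without paying its latency
            try:
                member.send(batch)
            except ExportError:
                member.breaker.record_failure(self.clock())
                continue  # fail over to the next exporter
            member.breaker.record_success()
            return member.name
        raise ExportError("all exporters in the group failed or were skipped")
```

In this model the breaker only avoids wasted attempts against an exporter that is already failing. Correctness still comes from the in-order retry, which matches the "mainly as a latency optimization" framing above.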

Example flow:

OTLP ingest -> export_group(clickhouse_live, kafka_dlq)
Kafka DLQ consumer -> clickhouse_replay
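
To make the routing rule behind this flow concrete, here is a small Python sketch of how a per-receiver override could be resolved. It is a model under stated assumptions, not Rotel's configuration schema: `Receiver`, `target_exporters`, `resolve_targets`, and the receiver names are hypothetical, and an export group is represented simply as a list of exporter names in priority order.

```python
from dataclasses import dataclass, field
from typing import Dict, List

TELEMETRY_TYPES = ("traces", "metrics", "logs")


@dataclass
class Receiver:
    name: str
    # Optional override per telemetry type; types not listed use the pipeline default.
    target_exporters: Dict[str, List[str]] = field(default_factory=dict)


def resolve_targets(receiver: Receiver, telemetry_type: str, pipeline_default: List[str]) -> List[str]:
    """Return the exporters (in priority order) that a receiver's data should go to."""
    if telemetry_type not in TELEMETRY_TYPES:
        raise ValueError(f"unknown telemetry type: {telemetry_type}")
    return receiver.target_exporters.get(telemetry_type, pipeline_default)


# Pipeline default: the export group from the example flow.
DEFAULT_GROUP = ["clickhouse_live", "kafka_dlq"]

otlp = Receiver("otlp")  # no override, so it uses the export group
dlq_consumer = Receiver(
    "kafka_dlq_consumer",
    target_exporters={t: ["clickhouse_replay"] for t in TELEMETRY_TYPES},
)

live_targets = resolve_targets(otlp, "traces", DEFAULT_GROUP)
replay_targets = resolve_targets(dlq_consumer, "traces", DEFAULT_GROUP)

# The replay path never includes the Kafka exporter, so replayed data cannot loop back into the DLQ.
replay_can_loop = "kafka_dlq" in replay_targets
```

Because the override replaces the default rather than extending it, a replayed batch that fails against `clickhouse_replay` is not written back to Kafka in this model. What should happen to such a batch (for example, leaving it unacknowledged on the topic) is outside the scope of the sketch.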

@cleaton cleaton changed the title Add export group failover and Kafka DLQ replay routing [POC] Add export group failover and Kafka DLQ replay routing May 3, 2026
@cleaton cleaton changed the title [POC] Add export group failover and Kafka DLQ replay routing [POC] Add export group failover and Kafka DLQ replay routing (resolve issue #336) May 3, 2026
@cleaton cleaton changed the title [POC] Add export group failover and Kafka DLQ replay routing (resolve issue #336) [POC] Add export group failover and Kafka DLQ replay routing May 3, 2026