[EXPORTER] Handle OTLP partial success response#4104
Conversation
| arena.get()); | ||
| grpc::Status status = OtlpGrpcClient::DelegateExport( | ||
| trace_service_stub_.get(), std::move(context), std::move(arena), request, response); | ||
| grpc::Status status = trace_service_stub_->Export(context.get(), *request, response); |
There was a problem hiding this comment.
Thanks for taking a look!
Got it. Updated DelegateExport to take an optional callback that runs before the arena destructs, and passed a callback that logs partial_success / success.
| arena.get()); | ||
| grpc::Status status = OtlpGrpcClient::DelegateExport( | ||
| log_service_stub_.get(), std::move(context), std::move(arena), request, response); | ||
| grpc::Status status = log_service_stub_->Export(context.get(), *request, response); |
There was a problem hiding this comment.
The same as otlp_grpc_exporter.cc:186
|
|
||
| sdk::common::ExportResult OtlpHttpClient::Export( | ||
| const google::protobuf::Message &message, | ||
| google::protobuf::Message *response, |
There was a problem hiding this comment.
I noticed the callback currently captures shared_ptr of response but the method signature also passes pointer to response separately. It might be cleaner to package these together—perhaps as a CompletionContext—so the lifecycle is managed in one place.
And we should use arena to create response to reduce memory segment.
There was a problem hiding this comment.
Good idea, updated to move this into an arena in HttpSessionData.
| http_client_->Export(*service_request, [metric_count]( | ||
| opentelemetry::sdk::common::ExportResult result) { | ||
|
|
||
| std::shared_ptr<proto::collector::metrics::v1::ExportMetricsServiceResponse> response = |
There was a problem hiding this comment.
We should use arena to create response to reduce memory segment. Maybe we can move arena into session to hold the lifetime, just like it in otlp_grpc_client
| if (result_callback_) | ||
| { | ||
| result_callback_(result); | ||
| result_callback_(result, response_); |
There was a problem hiding this comment.
I think this ordering is risky. response_ is owned by the arena stored in the session data, but ReleaseSession() moves that session data to the cleanup list before the callback reads response_. In async HTTP export, another thread could clean up the session and destroy the arena before this callback uses the response. Can we run the callback while the session still owns the arena, or otherwise keep the arena alive through the callback?
There was a problem hiding this comment.
Good catch, reordered so that the callback runs before the ReleaseSession() to ensure that the response_ is valid when the callback runs.
| // On 2xx, parse the body into the caller-provided typed response | ||
| if (response_ != nullptr && result == sdk::common::ExportResult::kSuccess) | ||
| { | ||
| if (!response_->ParseFromString(body_)) |
There was a problem hiding this comment.
This only handles binary protobuf responses. For OTLP/HTTP JSON mode, the server should send the response as JSON too, so ParseFromString() will fail and the partial_success response will be missed. Can we parse based on options_.content_type, and add a JSON response test as well?
There was a problem hiding this comment.
Thanks for pointing this out! Updated to parse based on the the content_type and added corresponding tests
There was a problem hiding this comment.
Pull request overview
Implements OTLP “partial success” handling across HTTP and gRPC exporters by parsing Export*ServiceResponse and emitting a log that includes the rejected item count and server-provided error message (per spec guidance to not retry on partial success).
Changes:
- Extend OTLP HTTP client/exporters to deserialize successful 2xx responses into typed protobuf responses (binary + JSON) and log partial success details.
- Add partial-success logging to OTLP gRPC exporters (sync + async paths) via an
on_completehook inDelegateExport. - Add test infrastructure (
ScopedTestLogHandler) and expand exporter tests to assert partial-success logging for HTTP/gRPC + JSON/proto.
Reviewed changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
test_common/include/opentelemetry/test_common/sdk/common/scoped_test_log_handler.h |
Adds reusable scoped log handler for tests to capture internal logs. |
sdk/test/metrics/meter_provider_sdk_test.cc |
Switches test to use shared ScopedTestLogHandler helper. |
sdk/test/metrics/CMakeLists.txt |
Links metrics tests with opentelemetry_test_common for the new test helper header. |
sdk/test/metrics/BUILD |
Adds Bazel dependency on //test_common:headers for the new test helper header. |
exporters/otlp/test/otlp_http_metric_exporter_test.cc |
Adds HTTP metric exporter partial-success tests (proto + JSON) and log assertions. |
exporters/otlp/test/otlp_http_log_record_exporter_test.cc |
Adds HTTP log exporter partial-success tests (proto + JSON) and log assertions. |
exporters/otlp/test/otlp_http_exporter_test.cc |
Adds HTTP trace exporter partial-success tests (proto + JSON) and log assertions. |
exporters/otlp/test/otlp_grpc_metric_exporter_test.cc |
Adds gRPC metric exporter partial-success test and log assertions. |
exporters/otlp/test/otlp_grpc_log_record_exporter_test.cc |
Adds gRPC log exporter partial-success test and log assertions. |
exporters/otlp/test/otlp_grpc_exporter_test.cc |
Adds gRPC trace exporter partial-success test and log assertions. |
exporters/otlp/src/otlp_http_metric_exporter.cc |
Allocates response on arena and logs partial success from parsed HTTP response. |
exporters/otlp/src/otlp_http_log_record_exporter.cc |
Allocates response on arena and logs partial success from parsed HTTP response. |
exporters/otlp/src/otlp_http_exporter.cc |
Allocates response on arena and logs partial success from parsed HTTP response. |
exporters/otlp/src/otlp_http_client.cc |
Parses successful HTTP response bodies into a caller-provided protobuf message (JSON/proto) and forwards it to callbacks while keeping arena alive. |
exporters/otlp/src/otlp_grpc_metric_exporter.cc |
Logs partial success for gRPC metrics responses in async and sync code paths. |
exporters/otlp/src/otlp_grpc_log_record_exporter.cc |
Logs partial success for gRPC logs responses in async and sync code paths. |
exporters/otlp/src/otlp_grpc_exporter.cc |
Logs partial success for gRPC trace responses; routes success logging through completion hook. |
exporters/otlp/src/otlp_grpc_client.cc |
Adds an optional on_complete callback to DelegateExport to inspect response on success while keeping arena alive. |
exporters/otlp/include/opentelemetry/exporters/otlp/otlp_http_client.h |
Adds typed-response async export overload and extends session data to retain arena/response lifetime. |
exporters/otlp/include/opentelemetry/exporters/otlp/otlp_grpc_client.h |
Adds optional on_complete callback to sync DelegateExport overloads. |
exporters/otlp/CMakeLists.txt |
Links OTLP exporter tests with opentelemetry_test_common for the shared log handler helper. |
exporters/otlp/BUILD |
Adds protobuf dependency and //test_common:headers deps needed by updated tests. |
CHANGELOG.md |
Adds changelog entry for OTLP partial success handling. |
| { | ||
| if (content_type_ == HttpRequestContentType::kJson) | ||
| { | ||
| if (!google::protobuf::util::JsonStringToMessage(body_, response_).ok()) |
There was a problem hiding this comment.
Can we avoid treating this as a success if parsing the response body fails? Right now we only log the parse error, so the exporter may still continue with an empty response and miss partial_success.
There was a problem hiding this comment.
Yep, updated to set the result as kFailure on parse fail
a0a3f29 to
8b8b3ea
Compare
25b916f to
0cd5bb2
Compare
Fixes #1577
Changes
PR to handle partial success responses from OTLP collector. As per the spec it states explicitly to not retry, the proposed solution is to log the rejection count and message.
For significant contributions please make sure you have completed the following items:
CHANGELOG.mdupdated for non-trivial changes