azure-monitor-opentelemetry incorrectly resolves cloud_RoleInstance in AKS #45532
Open
Labels
Monitor - Distro, Monitor - Exporter, customer-reported, needs-team-attention, question
Describe the bug
When exporting user metrics to Application Insights, the cloud_RoleInstance field gets an unexpected value. Attempts to set it to the pod name don't work.
To Reproduce
Steps to reproduce the behavior:
OTEL_RESOURCE_ATTRIBUTES: service.name=MyService,service.namespace=MyNamespace or OTEL_SERVICE_NAME=MyService.
Expected behavior
Pod name in the cloud_RoleInstance field.
Additional context
Looking at the _get_cloud_role_instance() function, the resolution process goes as follows:
1. ResourceAttributes.SERVICE_INSTANCE_ID
2. ResourceAttributes.K8S_POD_NAME
3. platform.node() (hostname)
The issue is that something (probably the Azure resource detectors) sets service.instance.id to a UUID (probably that of the VM), and it takes priority. In AKS, though, VM attributes make very little sense, as it is a containerized environment.
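To illustrate why a detector-supplied UUID hides the pod name, here is a minimal sketch of the lookup order described above. It is not the exporter's actual code: resolve_role_instance is a hypothetical name, and the resource is modeled as a plain dict keyed by the semantic-convention strings service.instance.id and k8s.pod.name.

```python
import platform
from typing import Mapping

# Semantic-convention keys behind ResourceAttributes.SERVICE_INSTANCE_ID
# and ResourceAttributes.K8S_POD_NAME.
SERVICE_INSTANCE_ID = "service.instance.id"
K8S_POD_NAME = "k8s.pod.name"


def resolve_role_instance(attributes: Mapping[str, str]) -> str:
    """Hypothetical stand-in for the resolution order described in this issue."""
    # 1. service.instance.id wins whenever it is present and non-empty.
    service_instance_id = attributes.get(SERVICE_INSTANCE_ID)
    if service_instance_id:
        return service_instance_id
    # 2. The pod name is only consulted when step 1 found nothing.
    pod_name = attributes.get(K8S_POD_NAME)
    if pod_name:
        return pod_name
    # 3. Fall back to the hostname.
    return platform.node()


# A detector that fills in service.instance.id shadows the pod name.
print(resolve_role_instance({SERVICE_INSTANCE_ID: "vm-uuid", K8S_POD_NAME: "my-pod"}))  # vm-uuid
print(resolve_role_instance({K8S_POD_NAME: "my-pod"}))  # my-pod
```

With this ordering, the K8S branch is reachable only if nothing populates service.instance.id first, which matches the behavior reported here.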
To add more to that, if I manually configure the exporter like so
then the pod name appears. This makes me think the behavior is unintended, since it works inconsistently across setups.
Suggestion
I think we should suppress producing ResourceAttributes.SERVICE_INSTANCE_ID if _is_on_aks() to allow the K8S branch to work.
Failed workaround attempts
I tried to use the K8S Downward API to set the pod name manually into service.instance.id, like so:
but I got inconsistent behavior: sometimes it works and sometimes it does not. I could not figure out the pattern; subsequent runs of the same job may yield different cloud_RoleInstance values. There is a related issue that, rather confusingly, says that resource detectors have priority over environment variables, but in this case it is unclear why the behavior is inconsistent and why this branch for k8s exists at all.
EDIT: Successful workaround
Similarly to the linked issue, setting OTEL_EXPERIMENTAL_RESOURCE_DETECTORS=otel resolves both problems:
- cloud_RoleInstance
- OTEL_RESOURCE_ATTRIBUTES
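For anyone who cannot change the pod spec, a rough sketch of applying the same workaround from Python follows. It assumes the distro reads the variable when configure_azure_monitor() runs, so the variable has to be set before that call; the helper name use_otel_detector_only is hypothetical, and setting the variable in the container environment is the more usual route.

```python
import os


def use_otel_detector_only() -> None:
    """Hypothetical helper: restrict resource detection to the 'otel' detector."""
    # Assumption: this must happen before the distro is configured,
    # otherwise the default detectors may already have run.
    os.environ["OTEL_EXPERIMENTAL_RESOURCE_DETECTORS"] = "otel"


use_otel_detector_only()

# Then configure the distro as usual, for example:
# from azure.monitor.opentelemetry import configure_azure_monitor
# configure_azure_monitor()
```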