test: validate Hermes subagent lineage and orphan export coverage #240

Merged: rapids-bot[bot] merged 2 commits into NVIDIA:main from mnajafian-nv:test/hermes-lineage-orphan-parity on Jun 7, 2026

@mnajafian-nv mnajafian-nv commented Jun 7, 2026

Contributor

Overview

This PR validates Hermes subagent lineage and orphan export coverage by adding exporter-visible session-path tests for the remaining Hermes lineage/orphan scenarios across ATOF and OpenInference.

  • I confirm this contribution is my own work, or I have the right to submit it under this project's license.
  • I searched existing issues and open pull requests, and this does not duplicate existing work.

Details

  • Adds a Hermes orphan-subagent regression that proves a correlated subagent_stop exports exactly one readable orphan mark in ATOF, keeps that mark parented to the active Hermes turn, and attaches it to the OpenInference turn span instead of exporting a duplicate standalone orphan span.
  • Adds a Hermes child-session regression that proves subagent lineage is preserved across both ATOF and OpenInference for the nested child-session path, including correct parent linkage from the child subagent scope back to the parent turn.
  • Refactors the existing Hermes orphan and child-session setup to reuse small scenario drivers, reducing duplication while keeping the assertions in the same Hermes session coverage layer.
  • Keeps the scope test-only and does not change runtime behavior.
  • Closes the remaining explicit Hermes lineage/orphan exporter-proof gap we identified for ATOF and OpenInference without widening into unrelated fallback or plugin work.
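
The orphan-mark property described in the first bullet can be sketched as a small standalone check. This is an illustration under stated assumptions, not code from `session_tests.rs`: `ExportedEvent`, its fields, and `check_single_orphan_mark` are hypothetical names, and the real ATOF event schema may differ.

```rust
/// Hypothetical, simplified stand-in for an exported ATOF event.
/// The field names are assumptions for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub struct ExportedEvent {
    pub kind: String,
    pub name: String,
    pub uuid: String,
    pub parent_uuid: Option<String>,
}

/// Returns Ok(()) when exactly one mark named `mark_name` was exported
/// and that mark is parented to the turn identified by `turn_uuid`.
pub fn check_single_orphan_mark(
    events: &[ExportedEvent],
    mark_name: &str,
    turn_uuid: &str,
) -> Result<(), String> {
    // Collect every exported mark with the requested name.
    let marks: Vec<&ExportedEvent> = events
        .iter()
        .filter(|e| e.kind == "mark" && e.name == mark_name)
        .collect();

    match marks.as_slice() {
        // Exactly one mark, parented to the active turn: the expected shape.
        [mark] if mark.parent_uuid.as_deref() == Some(turn_uuid) => Ok(()),
        // Exactly one mark, but attached to the wrong parent (or to none).
        [mark] => Err(format!(
            "mark parented to {:?}, expected {turn_uuid}",
            mark.parent_uuid
        )),
        // Zero marks (nothing exported) or duplicates.
        other => Err(format!(
            "expected exactly one `{mark_name}` mark, found {}",
            other.len()
        )),
    }
}
```

Returning a `Result` with a message rather than a bare `bool` keeps a failing assertion readable, which matters when the same check runs against two exporters.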

Validated with:

  • cargo test -p nemo-relay-cli hermes_orphan_subagent_stop_exports_readable_mark_with_lineage -- --nocapture
  • cargo test -p nemo-relay-cli hermes_orphan_subagent_stop_links_atof_and_openinference_to_turn -- --nocapture
  • cargo test -p nemo-relay-cli hermes_subagent -- --nocapture
  • cargo test -p nemo-relay-cli claude_orphan_subagent_stop_after_closed_turn_does_not_open_null_turn -- --nocapture
  • PATH="$HOME/.local/nemo-relay-tools/bin:$PATH" uv run pre-commit run --all-files

Where should the reviewer start?

Start in crates/cli/tests/coverage/session_tests.rs with hermes_orphan_subagent_stop_links_atof_and_openinference_to_turn and hermes_subagent_child_session_preserves_atof_and_openinference_lineage, then review the small shared Hermes scenario drivers those tests reuse.

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • Relates to Hermes observability consistency work.

Summary by CodeRabbit

  • Tests
    • Expanded test coverage for session management and observability lineage tracking.
    • Added integration tests validating lineage across parent/child and orphan subagent scenarios.
    • Added verification for ATIF and OpenInference export behavior to ensure events and spans are attached to the correct sessions/turns.

Signed-off-by: mnajafian-nv <mnajafian@nvidia.com>
@mnajafian-nv added this to the 0.4 milestone on Jun 7, 2026
@mnajafian-nv self-assigned this on Jun 7, 2026
@mnajafian-nv requested a review from a team as a code owner on June 7, 2026 02:01
@mnajafian-nv added the "Test" (Test related) label on Jun 7, 2026
@github-actions[bot] added the "size:L" (PR is large) and "lang:rust" (PR changes/introduces Rust code) labels on Jun 7, 2026
coderabbitai Bot commented Jun 7, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 5a68019f-cc99-49bd-a68d-0b72604e89f5

📥 Commits

Reviewing files that changed from the base of the PR and between 38b8412 and e4b69ae.

📒 Files selected for processing (1)
  • crates/cli/tests/coverage/session_tests.rs
📜 Recent review details
🧰 Additional context used
📓 Path-based instructions (10)
**/*.rs

📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)

Use snake_case naming convention for Rust identifiers (e.g., nemo_relay_tool_call)

**/*.rs: Any Rust change must run just test-rust
Any Rust change must run cargo fmt --all
Any Rust change must run cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Run cargo fmt --all for all FFI work since it is Rust work
Run just test-rust to validate FFI changes
Run cargo clippy --workspace --all-targets -- -D warnings to enforce strict linting on FFI work

When Rust files changed as part of Go work, also run cargo fmt --all, just test-rust, and cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Run cargo fmt --all when Rust files are changed as part of Node work
Run cargo clippy --workspace --all-targets -- -D warnings when Rust files are changed as part of Node work
Run just test-rust when Rust files are changed as part of Node work

**/*.rs: Run cargo fmt --all to format all Rust code
Run cargo clippy --workspace --all-targets -- -D warnings to enforce all clippy lints as errors

**/*.rs: Run cargo fmt --all when Rust files changed as part of WebAssembly work
Run cargo clippy --workspace --all-targets -- -D warnings when Rust files changed as part of WebAssembly work

**/*.rs: If any Rust code changed, always run just test-rust
If any Rust code changed, also run cargo fmt --all
If any Rust code changed, also run cargo clippy --workspace --all-targets -- -D warnings
Run Rust formatting with cargo fmt --all
Run Rust linting with cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Use cargo fmt for Rust code formatting
Run cargo clippy -- -D warnings to lint Rust code and treat all warnings as errors
Use Rust snake_case naming convention for Rust identifiers
Include SPDX license header in all Rust source files using double-slash comment syntax
Validate Rust code with uv run pre-commit run --all-files to enforce cargo fmt formatting check, cargo clippy lints, and cargo deny aud...

Files:

  • crates/cli/tests/coverage/session_tests.rs
{crates/adaptive/**/*.rs,**/*test*.{rs,py,go,ts,js},**/*adaptive*test*.{rs,py,go,ts,js},docs/plugins/adaptive/**}

📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)

Maintain documented and tested validation and report behavior for adaptive surfaces

Files:

  • crates/cli/tests/coverage/session_tests.rs
**/{Cargo.toml,**/*.rs}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Maintain consistency between Rust package names in Cargo.toml and their actual usage across the codebase

Files:

  • crates/cli/tests/coverage/session_tests.rs
**/*.{h,hpp,c,cpp,rs}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Ensure FFI header and library naming follows consistent conventions across platform-specific builds

Files:

  • crates/cli/tests/coverage/session_tests.rs
**/*.{rs,toml}

📄 CodeRabbit inference engine (.agents/skills/rename-surfaces/SKILL.md)

Update Rust crate names and module prefixes during coordinated rename operations

Files:

  • crates/cli/tests/coverage/session_tests.rs
**/*.{rs,py,js,ts,tsx,jsx,go,sh,toml,yaml,yml,md}

📄 CodeRabbit inference engine (AGENTS.md)

Keep SPDX headers on source, docs, scripts, and configuration files. The project is Apache-2.0.

Files:

  • crates/cli/tests/coverage/session_tests.rs
**/*.{rs,py,go,js,ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Follow binding naming conventions: Rust and Python use snake_case, C FFI exports prefixed nemo_relay_, Go uses PascalCase for public APIs, Node.js uses camelCase.

Files:

  • crates/cli/tests/coverage/session_tests.rs
crates/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

crates/**/*.rs: Keep async behavior on the existing tokio-based model. Bindings should preserve callback and future lifetimes rather than blocking or hiding async work unexpectedly.
Use Json = serde_json::Value in Rust-facing runtime APIs for JSON payload handling.

Files:

  • crates/cli/tests/coverage/session_tests.rs
{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}

⚙️ CodeRabbit configuration file

{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}: Tests should cover the behavior promised by the changed API surface, including error paths and cross-request isolation where relevant.
Prefer assertions on lifecycle events, scope stacks, middleware ordering, and binding parity over shallow smoke tests.

Files:

  • crates/cli/tests/coverage/session_tests.rs
**

⚙️ CodeRabbit configuration file

**:

AGENTS.md

This file provides guidance to agents, including Claude Code and OpenAI Codex, when working in this repository.

Project Overview

NeMo Relay is a multi-language agent runtime framework for execution scopes, lifecycle events, middleware, plugins, and observability around tool and LLM calls. The core runtime is Rust. Primary supported bindings are Rust, Python, and Node.js. Go, WebAssembly, and the raw C FFI are experimental and source-first.

The shared runtime model is:

  1. Scope stacks decide where work belongs and which scope-local behavior is visible.
  2. Middleware registries decide what guardrails and intercepts run around managed calls.
  3. Plugins install reusable runtime behavior from configuration.
  4. Events record runtime behavior in ATOF form.
  5. Subscribers and exporters consume events in-process or export them to ATIF, OpenTelemetry, OpenInference, or other backends.

Repository Structure

The repository layout separates the Rust runtime, language bindings, documentation,
integration patches, and agent-facing skills.

crates/
  core/       # Rust core runtime crate, published as nemo-relay
  adaptive/   # Adaptive runtime primitives and plugin components
  python/     # PyO3 native extension for the Python package
  ffi/        # Raw C ABI layer used by downstream bindings such as Go
  node/       # NAPI Node.js binding and JavaScript/TypeScript entry points
  wasm/       # wasm-bindgen WebAssembly binding and JS wrappers
python/
  nemo_relay/  # Python wrapper package: scopes, tools, LLM, middleware, typed helpers, plugins, adaptive helpers
  tests/      # Python tests
go/
  nemo_relay/  # Experimental Go CGo binding and tests
fern/         # Fern documentation site
scripts/      # Stable wrappers and helper scripts; build/test/docs entry points live in justfile
third_party/  # P...

Files:

  • crates/cli/tests/coverage/session_tests.rs
🔇 Additional comments (1)
crates/cli/tests/coverage/session_tests.rs (1)

93-109: LGTM!


Walkthrough

Adds ATIF exporter test utilities and session-scoped subscriber plumbing, refactors Hermes payload driving into reusable async helpers, and adds two integration tests validating orphan and parent-child subagent lineage across ATIF and OpenInference exporters.
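
As a rough illustration of what session-scoped subscriber plumbing can look like, the sketch below keeps only events whose session ID belongs to a tracked set. It is a stand-in under assumptions, not the helper code added in this PR: `Event`, `FilteredCollector`, and `on_event` are hypothetical names, and the real subscriber API may have a different shape.

```rust
use std::collections::HashSet;

/// Hypothetical minimal event: an optional session id plus a name.
pub struct Event {
    pub session_id: Option<String>,
    pub name: String,
}

/// Collects only the events that belong to one of the tracked sessions,
/// so unrelated sessions in the same process cannot leak into assertions.
pub struct FilteredCollector {
    tracked: HashSet<String>,
    seen: Vec<Event>,
}

impl FilteredCollector {
    /// Builds a collector that tracks the given session ids.
    pub fn new<I, S>(sessions: I) -> Self
    where
        I: IntoIterator<Item = S>,
        S: Into<String>,
    {
        Self {
            tracked: sessions.into_iter().map(Into::into).collect(),
            seen: Vec::new(),
        }
    }

    /// Records the event if its session is tracked; returns whether it was kept.
    /// Events without a session id are always dropped.
    pub fn on_event(&mut self, event: Event) -> bool {
        let keep = event
            .session_id
            .as_deref()
            .map_or(false, |id| self.tracked.contains(id));
        if keep {
            self.seen.push(event);
        }
        keep
    }

    /// Names of the events kept so far, in arrival order.
    pub fn names(&self) -> Vec<&str> {
        self.seen.iter().map(|e| e.name.as_str()).collect()
    }
}
```

Filtering by an explicit set of session IDs is one way to keep parent and child sessions visible together while excluding everything else.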

Changes

Hermes observability lineage coverage

| Layer | File(s) | Summary |
|---|---|---|
| ATIF/OpenInference test plumbing and utilities | crates/cli/tests/coverage/session_tests.rs | Imports ATIF exporter types and HashSet. Adds make_atof_test_exporter, read_atof_events, event_session_id, tracked_sessions, and register_filtered_session_subscriber to build session-filtered ATIF/OpenInference test plumbing. |
| Hermes session orchestration helpers | crates/cli/tests/coverage/session_tests.rs | Adds async helpers drive_hermes_orphan_subagent_stop and drive_hermes_subagent_child_session that adapt Hermes hook payloads and apply resulting events to the SessionManager for orphan and parent/child lifecycles. |
| Orphan subagent stop lineage validation | crates/cli/tests/coverage/session_tests.rs | Refactors existing orphan test to use the helper and adds hermes_orphan_subagent_stop_links_atof_and_openinference_to_turn, wiring session-filtered ATIF and OpenInference and asserting ATIF exports a single orphan mark linked to the turn while OpenInference attaches the mark under the turn span. |
| Child subagent session lineage validation | crates/cli/tests/coverage/session_tests.rs | Refactors child-session test to use the helper and adds hermes_subagent_child_session_preserves_atof_and_openinference_lineage, asserting ATIF child events reference the parent turn UUID and OpenInference exports a single child subagent span with matching parent UUID. |
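
The parent-linkage assertion in the child-session row amounts to following parent UUIDs from a child span back to the parent turn. A minimal sketch of that walk follows, assuming a hypothetical `Span` record with `uuid` and `parent_uuid` fields; the exported ATIF and OpenInference records are richer than this.

```rust
use std::collections::HashMap;

/// Hypothetical span record holding only the fields needed for lineage checks.
pub struct Span {
    pub uuid: String,
    pub parent_uuid: Option<String>,
}

/// Returns true when following parent links from `start` reaches `ancestor`.
/// A span is not considered a descendant of itself.
pub fn descends_from(spans: &[Span], start: &str, ancestor: &str) -> bool {
    // Index each span's parent by the span's own uuid.
    let parents: HashMap<&str, Option<&str>> = spans
        .iter()
        .map(|s| (s.uuid.as_str(), s.parent_uuid.as_deref()))
        .collect();

    let mut current = start;
    // Bound the walk by the span count so a cyclic export cannot loop forever.
    for _ in 0..spans.len() {
        match parents.get(current) {
            Some(Some(parent)) if *parent == ancestor => return true,
            Some(Some(parent)) => current = *parent,
            // Unknown span or a root span: the chain ends without a match.
            _ => return false,
        }
    }
    false
}
```

A direct-parent check (`parent_uuid == Some(turn_uuid)`) is stricter; the transitive walk is useful when intermediate scopes may sit between the child subagent and the turn.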

Possibly related PRs

  • NVIDIA/NeMo-Relay#214: Related tests validating nested-vs-fallback subagent lineage and ATIF parent_uuid propagation.
  • NVIDIA/NeMo-Relay#219: Related changes to session_tests.rs adding Hermes→ATIF trajectory assertions and fidelity checks.
  • NVIDIA/NeMo-Relay#235: Related refactors extending OpenInference subscriber wiring and session lifecycle test coverage.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title follows Conventional Commits format with type 'test', clear imperative summary of changes, and is within the 72-character limit (65 chars). |
| Description check | ✅ Passed | The PR description covers all required template sections: Overview with confirmation checkboxes, Details explaining the changes, Where should the reviewer start guidance, and Related Issues. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@mnajafian-nv changed the title from "test: validate Hermes subagent lineage and orphan export parity" to "test: validate Hermes subagent lineage and orphan export coverage" on Jun 7, 2026

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/cli/tests/coverage/session_tests.rs`:
- Around line 93-105: The event_session_id function currently inspects both
metadata and payload (data.session_id and data.extra.session_id); change it to
only read session id from metadata
(metadata.get("session_id").and_then(Value::as_str)) so subscriber filtering
matches the ATIF exporter’s contract; remove the data.* fallbacks from
event_session_id and update tests that relied on payload session IDs to instead
set metadata.session_id or, when simulating events that truly lack metadata, use
the explicit test marker HERMES_ROUTED_TEST_SESSION_KEY in metadata so tests
remain deterministic.
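
A minimal sketch of the metadata-only lookup this comment asks for is shown below. It is an assumption-laden illustration: the comment implies the real function works on `serde_json::Value`, whereas this version models `metadata` and `data` as plain string maps so that it compiles with the standard library alone. The `Event` struct here is hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical event with separate metadata and payload maps.
/// The real events carry JSON values; string maps keep the sketch std-only.
pub struct Event {
    pub metadata: HashMap<String, String>,
    pub data: HashMap<String, String>,
}

/// Reads the session id from metadata only.
/// Payload fields such as `data.session_id` are deliberately ignored.
pub fn event_session_id(event: &Event) -> Option<&str> {
    event.metadata.get("session_id").map(String::as_str)
}
```

With this shape, an event that carries a session ID only in its payload resolves to `None`, so a test that relied on the payload fallback has to set the ID in metadata instead.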
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 20743dbd-48f0-405e-bb54-e2bacf34ff47

📥 Commits

Reviewing files that changed from the base of the PR and between 2cd62dc and 38b8412.

📒 Files selected for processing (1)
  • crates/cli/tests/coverage/session_tests.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Check / Run
🔇 Additional comments (1)
crates/cli/tests/coverage/session_tests.rs (1)

5-8: LGTM!

Also applies to: 14-14, 48-57, 85-91, 107-114, 467-582, 2817-2817, 2844-2933, 2945-2952, 2980-3095

Signed-off-by: mnajafian-nv <mnajafian@nvidia.com>
@mnajafian-nv

Contributor Author

/merge

@rapids-bot rapids-bot Bot merged commit 7e2e10b into NVIDIA:main Jun 7, 2026
31 checks passed
@mnajafian-nv mnajafian-nv deleted the test/hermes-lineage-orphan-parity branch June 7, 2026 02:54
Labels

lang:rust PR changes/introduces Rust code size:L PR is large Test Test related
