
Caches warp-to-torch views to deduplicate conversions per step#5199

Open
hougantc-nvda wants to merge 4 commits into isaac-sim:develop from hougantc-nvda:hougantc/cache-warp-torch

Conversation

Contributor

@hougantc-nvda hougantc-nvda commented Apr 7, 2026

Adds warp_to_torch(owner, field_name) utility that caches wp.to_torch() results using the Warp array's memory pointer as the cache key. Since wp.to_torch() returns a zero-copy view sharing the same GPU buffer, in-place updates are automatically reflected in the cached tensor without invalidation. A new conversion only occurs when the underlying buffer is reallocated (pointer change). This eliminates hundreds of redundant wp.to_torch() calls per frame in observation, reward, and termination functions.

Caching is enabled by default and can be toggled via SimulationCfg.enable_warp_torch_cache.

Migrate all built-in and task-specific MDP terms in isaaclab.envs.mdp and isaaclab_tasks to use warp_to_torch instead of raw wp.to_torch.
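The pointer-keyed caching idea described above can be sketched in a few lines. This is an illustrative stand-in, not the Isaac Lab implementation: NumPy's `.ctypes.data` plays the role of `wp.array.ptr`, and `arr.view()` plays the role of the zero-copy `wp.to_torch()` view.

```python
import numpy as np

def cached_view(owner, field_name):
    """Sketch of warp_to_torch(): cache a zero-copy view of the backing
    array, keyed by the buffer's memory address. A new view is created
    only when the buffer is reallocated (pointer change)."""
    cache = getattr(owner, "_warp_torch_cache", None)
    if cache is None:
        cache = {}
        object.__setattr__(owner, "_warp_torch_cache", cache)
    arr = getattr(owner, field_name)
    ptr = arr.ctypes.data          # stand-in for wp.array.ptr
    entry = cache.get(field_name)
    if entry is not None and entry[0] == ptr:
        return entry[1]            # same buffer: reuse the cached view
    view = arr.view()              # zero-copy, shares memory with arr
    cache[field_name] = (ptr, view)
    return view
```

Because the cached view shares memory with the source buffer, in-place writes are visible through it immediately; only reassigning the attribute to a freshly allocated array produces a new view.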

Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context.
List any dependencies that are required for this change.

Fixes # (issue)

Type of change

  • New feature (non-breaking change which adds functionality)

Screenshots

Please attach before and after screenshots of the change if applicable.

Checklist

  • I have read and understood the contribution guidelines
  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the changelog and the corresponding version in the extension's config/extension.toml file
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

Contributor

greptile-apps bot commented Apr 7, 2026

Greptile Summary

This PR introduces a warp_to_torch caching utility that deduplicates wp.to_torch() calls per simulation step, keyed by _sim_timestamp (for physics assets with PhysX/Newton backends) or by Warp buffer pointer (sensor fallback). All built-in MDP terms and a wide set of task-specific observation/reward/termination functions are migrated to use it.

The caching design is sound: wp.to_torch() produces zero-copy views that share memory with Warp buffers, so both cached and non-cached paths always reflect the latest physics state. The timestamp-based invalidation correctly fires each physics substep via the existing _sim_timestamp increment in the data classes.

Confidence Score: 4/5

Safe to merge with minor improvements; the caching strategy is correctness-sound since wp.to_torch() views share Warp buffer memory and always reflect current physics state.

Score of 4 (not 5) because: (1) no tests are included for the new warp_torch_cache module despite being a non-trivial shared-state mechanism, (2) the getattr-before-cache-check inefficiency partially defeats the optimization for computed properties, and (3) the module-level global could cause subtle test-ordering issues. None of these are blocking correctness bugs, but they are real quality concerns the developer should address.

source/isaaclab/isaaclab/utils/warp_torch_cache.py — contains the three P2 issues (getattr ordering, global state, object.setattr fragility)

Important Files Changed

Filename Overview
source/isaaclab/isaaclab/utils/warp_torch_cache.py New caching module; correct overall but has a getattr-before-cache-check inefficiency and a module-level global that is hard to reset in tests
source/isaaclab/isaaclab/sim/simulation_cfg.py Adds enable_warp_torch_cache flag with correct default (True), type annotation, and docstring
source/isaaclab/isaaclab/sim/simulation_context.py Calls set_caching_enabled() during init based on SimulationCfg; plumbing is correct
source/isaaclab/isaaclab/utils/__init__.pyi Exports warp_to_torch and set_caching_enabled via lazy_loader stub correctly
source/isaaclab/isaaclab/envs/mdp/observations.py Migrates all wp.to_torch() calls; body_pose_w clone guard already handles aliasing when body_ids is a slice
source/isaaclab/isaaclab/envs/mdp/rewards.py Migrates all wp.to_torch() calls; all operations produce new tensors, no in-place mutation risk
source/isaaclab/isaaclab/envs/mdp/terminations.py Migrates wp.to_torch() calls to warp_to_torch(); correct usage throughout
source/isaaclab/isaaclab/managers/observation_manager.py Minor supporting update; no structural changes to observation computation
source/isaaclab/docs/CHANGELOG.rst New 4.5.26 heading with correct RST formatting, matches extension.toml version
source/isaaclab_tasks/docs/CHANGELOG.rst New 1.5.19 heading with correct RST formatting, matches extension.toml version
source/isaaclab_tasks/isaaclab_tasks/manager_based/manipulation/pick_place/mdp/terminations.py Migrates wp.to_torch() calls; indexing with [:, 0] on cached tensor returns a view but torch.abs() and comparisons produce new tensors so no mutation issue
source/isaaclab_tasks/isaaclab_tasks/manager_based/manipulation/pick_place/mdp/observations.py Correct reuse of body_pos_w cached view within same function call; arithmetic produces new tensors

Sequence Diagram

sequenceDiagram
    participant Caller as Observation/Reward Term
    participant WTC as warp_to_torch()
    participant Owner as asset.data
    participant Cache as _warp_torch_cache dict
    participant Warp as wp.to_torch()

    Caller->>WTC: warp_to_torch(asset.data, "root_pos_w")
    WTC->>Owner: getattr(owner, field_name)
    Owner-->>WTC: warp_array
    WTC->>Cache: cache.get(field_name)
    alt Cache miss OR timestamp changed
        Cache-->>WTC: None / stale entry
        WTC->>Warp: wp.to_torch(warp_array)
        Warp-->>WTC: tensor view (zero-copy)
        WTC->>Cache: store (sim_ts, tensor)
    else Cache hit
        Cache-->>WTC: (sim_ts, cached_tensor)
    end
    WTC-->>Caller: tensor (shared memory with Warp buffer)

    Note over Warp,Cache: Per-step: sim_ts advances → cache miss → new view
    Note over Warp,Cache: Sensors (no _sim_timestamp): ptr check, effectively cache-forever if buffer is fixed


Contributor Author

hougantc-nvda commented Apr 8, 2026

The environments_training tests are failing due to existing failures:

    "success": false,
    "msg": "duration above threshold: 377.5775 > 300"

isaaclab [1/3] is failing due to VideoRecorder, which appears to be fixed in PR 5207.

isaaclab [2/3] and isaaclab [3/3] were already failing prior to this PR, as evidenced in the corresponding CI jobs.

@AntoineRichard
Collaborator

Thanks @hougantc-nvda! I'll try to review this by end of week!

@hougantc-nvda hougantc-nvda requested a review from rwiltz April 8, 2026 15:28

@isaaclab-review-bot isaaclab-review-bot bot left a comment


Review: Caches warp-to-torch views to deduplicate conversions per step

Good optimization overall — eliminating redundant wp.to_torch() calls per step is worthwhile for hot-path performance, and the pointer-based cache key is the correct invariant to exploit since wp.to_torch returns zero-copy views.

Architecture ✅

The approach is sound:

  • Pointer-identity keying is the right primitive. Since wp.to_torch() creates a zero-copy view sharing the same GPU buffer, in-place Warp writes are automatically visible through the cached tensor. A new conversion is only needed when the buffer is reallocated (pointer change). This avoids the entire class of stale-cache bugs that would arise with timestamp-based or step-count-based invalidation.
  • SimulationCfg.enable_warp_torch_cache is a clean escape hatch for debugging.
  • Attaching the cache dict to the owner (_warp_torch_cache) keeps locality per data object, avoiding a global dict keyed on object id which would leak memory.

Implementation — Issues Found

1. Redundant getattr on every call even on cache hit (minor perf)

In warp_to_torch(), warp_array = getattr(owner, field_name) is evaluated unconditionally before checking the cache. For properties that trigger lazy evaluation (e.g. body_link_pose_w), the getter fires on every call even when the cache entry is valid and the result is immediately discarded. This partially undermines the purpose of the cache.

Suggested fix — check the cache entry first, fetching the attribute only to validate the pointer on a potential hit or to convert on a miss:

entry = cache.get(field_name)
if entry is not None:
    # Potential hit: fetch the array only to compare the current pointer.
    warp_array = getattr(owner, field_name)
    if entry[0] == int(warp_array.ptr):
        return entry[1]
else:
    # Miss: fetch the array so it can be converted below.
    warp_array = getattr(owner, field_name)

new_tensor = wp.to_torch(warp_array)
cache[field_name] = (int(warp_array.ptr), new_tensor)
return new_tensor

However, I see the existing review thread from greptile already flagged this (suggesting a timestamp-based approach to avoid the getattr entirely). The pointer check does require the getattr since you need the current pointer value. If the properties are cheap (as noted, ArticulationData has its own internal lazy cache), this is acceptable for V1. Just be aware this is the main residual overhead.
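For reference, the timestamp-keyed variant mentioned above could look roughly like this. The attribute names (`_sim_timestamp`, `_warp_torch_cache`) follow the review text and are assumptions about the real module; `to_torch` is passed in as a stand-in for `wp.to_torch` so the sketch runs without Warp.

```python
def warp_to_torch_ts(owner, field_name, to_torch):
    """Sketch of a timestamp-keyed cache: while owner._sim_timestamp is
    unchanged, return the cached tensor without touching the (possibly
    lazily computed) property at all."""
    cache = getattr(owner, "_warp_torch_cache", None)
    if cache is None:
        cache = {}
        object.__setattr__(owner, "_warp_torch_cache", cache)
    ts = owner._sim_timestamp
    entry = cache.get(field_name)
    if entry is not None and entry[0] == ts:
        return entry[1]                      # hit: no getattr, no conversion
    tensor = to_torch(getattr(owner, field_name))
    cache[field_name] = (ts, tensor)
    return tensor
```

The trade-off is exactly the one discussed above: the timestamp check skips the getattr entirely on a hit, but it reintroduces one conversion per field per timestamp tick, and it requires the owner to maintain `_sim_timestamp`, which sensors may not have.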

2. Thread safety of module-level _CACHING_ENABLED

The global _CACHING_ENABLED is set by SimulationContext.__init__(). In unit tests that create/destroy multiple contexts, or in multi-threaded scenarios, this could be silently poisoned. Not a blocker for merge, but consider documenting that set_caching_enabled() is process-global and not thread-safe.

3. object.__setattr__ and __slots__ fragility

The try/except on lines 56-58 already handles the AttributeError case for __slots__, so this is properly guarded. Good.

4. Cache growth is unbounded per owner

Each unique field_name accessed on an owner adds an entry to _warp_torch_cache. For typical usage (dozens of fields per data object), this is fine. But there is no eviction — if someone dynamically generates field names, the dict will grow indefinitely. Not a real concern for the current usage patterns, but worth a docstring note.

Correctness — Potential Silent Failure Modes

5. Cache correctness depends on Warp's reallocation semantics ⚠️

The cache assumes that when a Warp array is reallocated (e.g. on environment reset or resize), its .ptr changes. This is true for Warp's current implementation where wp.array allocations always produce fresh device memory pointers. If Warp ever introduces a memory pool that reuses freed pointers (like CUDA's cudaMallocAsync with pool semantics), the same pointer could be returned for a different allocation, causing the cache to serve a stale tensor.

This is a low-probability risk but worth calling out in the docstring as a design assumption: "Relies on Warp's guarantee that reallocated arrays produce distinct ptr values."

6. Passing non-Warp-backed attributes

If getattr(owner, field_name) returns something that isn't a wp.array, the code will crash on warp_array.ptr with an AttributeError. This could happen if someone passes a plain tensor attribute name. Consider adding a type check or at minimum documenting that field_name must resolve to a wp.array.

Observation Manager Micro-optimization ✅

The pre-resolution of _NoiseCfg and _NoiseModelCfg in compute_group() to avoid repeated lazy-module __getattr__ lookups is a nice touch. Correct and low-risk.

Test Coverage ❌

The PR checklist has "I have added tests" unchecked. For a caching layer that silently serves cached tensors on a hot path, this is concerning. At minimum, I'd want:

  1. Cache hit test — call warp_to_torch twice on the same owner+field and verify identity (the same tensor object is returned).
  2. Cache invalidation test — reassign the Warp array on the owner (simulating buffer reallocation), verify a fresh tensor is returned with new data.
  3. Caching disabled test — call set_caching_enabled(False), verify warp_to_torch returns a fresh tensor each time.
  4. Non-Warp attribute test — verify graceful error handling.
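Tests 1 and 2 could be sketched as below. To keep the sketch self-contained, `FakeWarpArray` stands in for `wp.array` and the cache is re-implemented with a NumPy view in place of `wp.to_torch`; the real tests would exercise the actual `warp_to_torch` against Warp arrays.

```python
import numpy as np

class FakeWarpArray:
    """Stand-in for wp.array: a buffer plus its memory address."""
    def __init__(self, data):
        self.data = np.asarray(data, dtype=np.float32)
        self.ptr = self.data.ctypes.data

_CACHE = {}  # (id(owner), field) -> (ptr, tensor); module-global like the real one

def warp_to_torch(owner, field_name):
    """Simplified re-implementation of the cache under test."""
    arr = getattr(owner, field_name)
    key = (id(owner), field_name)
    entry = _CACHE.get(key)
    if entry is not None and entry[0] == arr.ptr:
        return entry[1]
    tensor = arr.data.view()  # zero-copy view, like wp.to_torch()
    _CACHE[key] = (arr.ptr, tensor)
    return tensor

class DummyData:
    pass

def test_cache_hit_returns_same_object():
    d = DummyData()
    d.root_pos_w = FakeWarpArray([0.0, 0.0])
    assert warp_to_torch(d, "root_pos_w") is warp_to_torch(d, "root_pos_w")

def test_reallocation_invalidates():
    d = DummyData()
    d.root_pos_w = FakeWarpArray([0.0, 0.0])
    t1 = warp_to_torch(d, "root_pos_w")
    d.root_pos_w = FakeWarpArray([1.0, 1.0])  # simulated buffer reallocation
    t2 = warp_to_torch(d, "root_pos_w")
    assert t2 is not t1 and t2[0] == 1.0
```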

Minor Nits

  • The __init__.pyi stub exports warp_to_torch and set_caching_enabled, but the corresponding __init__.py (not shown in diff) may also need updating for runtime imports.
  • The changelog entries are well-written.

Summary

Aspect Rating
Architecture ✅ Sound — pointer-identity cache is correct
Implementation ⚠️ Minor: redundant getattr on hit path
Correctness ✅ Safe under current Warp semantics
Risk Low — graceful fallback exists
Tests ❌ Missing — should add before merge

Recommendation: Approve with minor changes — add unit tests for the cache layer before merge.

Add warp_to_torch(owner, field_name) utility that caches wp.to_torch()
results per simulation step, keyed by _sim_timestamp or buffer pointer.
This eliminates hundreds of redundant wp.to_torch() calls per frame in
observation, reward, and termination functions.

Caching is enabled by default and can be toggled via
SimulationCfg.enable_warp_torch_cache.

Migrate all built-in and task-specific MDP terms in isaaclab.envs.mdp
and isaaclab_tasks to use warp_to_torch instead of raw wp.to_torch.
…dation

Replace the dual-strategy cache (sim_timestamp primary, buffer pointer
fallback) with a unified pointer-based approach.  Since wp.to_torch()
returns a zero-copy view sharing the same GPU memory, in-place updates
are automatically visible through the cached tensor—no invalidation
needed.  A new conversion only occurs when the Warp array is reallocated
to a different buffer (pointer change).

This removes the dependency on _sim_timestamp, making the cache more
general (works outside a simulation loop) and eliminating redundant
wp.to_torch() calls that the timestamp approach forced once per field
per step.  Also removes the unused single-argument calling convention.
@hougantc-nvda hougantc-nvda force-pushed the hougantc/cache-warp-torch branch from c81066b to 304b0b9 on April 8, 2026 at 22:32

Labels

isaac-lab Related to Isaac Lab team


2 participants