Skip to content

Add GR00T remote closed-loop policy + base-container support#655

Merged
xyao-nv merged 6 commits into
mainfrom
xyao/dev/gr00t_remote_policy_class
May 6, 2026
Merged

Add GR00T remote closed-loop policy + base-container support#655
xyao-nv merged 6 commits into
mainfrom
xyao/dev/gr00t_remote_policy_class

Conversation

@xyao-nv
Copy link
Copy Markdown
Collaborator

@xyao-nv xyao-nv commented May 6, 2026

Summary

Add GR00T remote closed-loop policy + base-container support

Detailed description

  • Why: Run GR00T closed-loop eval against a remote server from the base Arena container, without dragging in the full GR00T training stack (torch / flash-attn / transformers).
  • What:
  1. New Gr00tRemoteClosedloopPolicy that reuses the local obs/action pipeline but talks to GR00T's PolicyClient over zmq, plus mock-based unit tests
  2. io_utils.load_gr00t_modality_config_from_file falls back to an inlined load_modality_config so it works without tyro
  3. base Dockerfile now installs msgpack/msgpack-numpy/pyzmq and pip install --no-deps -e of the GR00T source.
  • Impact: Remote GR00T eval now works in the base container

TODO

  • Action chunking
  • Pi remote policy
  • Refactor policy cfg class
  • Remove existing gr00t imple

@xyao-nv xyao-nv marked this pull request as ready for review May 6, 2026 05:23
Copy link
Copy Markdown
Contributor

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Isaac Lab Review Bot

Summary

This PR adds a Gr00tRemoteClosedloopPolicy class that delegates inference to a remote GR00T policy server via ZMQ, plus supporting changes to make the base container importable without the full GR00T training stack. The implementation correctly reuses the existing obs/action translation pipeline and includes well-structured unit tests. However, there are several correctness issues around reset semantics, unused variables, and missing num_envs CLI argument handling.

Architecture Impact

  • New policy class integrates with existing PolicyBase and ActionChunkingState — follows established patterns
  • Dockerfile changes affect all base container builds — the --no-deps --ignore-requires-python install is intentional but creates a fragile import path
  • io_utils.py fallback enables GR00T imports without tyro, but the sys.path mutation persists globally
  • No changes to existing local policy — this is additive

Implementation Verdict

Minor fixes needed — one semantic bug in reset handling, one missing CLI argument, and a few code quality issues.

Test Coverage

Good mock-based unit tests covering the happy path (obs structure, action shape), reset propagation, and connection failure. Missing:

  • Test for partial reset (env_ids subset) — the current implementation ignores it
  • Test for get_action path through ActionChunkingState (only _get_action_chunk is tested directly)
  • No integration test, but that's reasonable given the remote dependency

CI Status

No CI checks available yet — cannot verify Dockerfile builds or test execution.

Findings

🔴 Critical: gr00t_remote_closedloop_policy.py:219-222 — reset() ignores env_ids, resets ALL environments

def reset(self, env_ids: torch.Tensor | None = None):
    if env_ids is None:
        env_ids = slice(None)
    self._client.reset()  # <-- Always resets entire server state
    self._chunking_state.reset(env_ids)

The env_ids parameter is accepted but self._client.reset() unconditionally resets the remote server state for all environments, while only _chunking_state respects the partial reset. If env_ids specifies a subset, the server state becomes inconsistent with the local chunking state. Either document that partial reset is unsupported (and raise on non-None env_ids), or don't reset the client when only a subset is specified.

🔴 Critical: gr00t_remote_closedloop_policy.py:140-141 — Missing --num_envs CLI argument

group.add_argument("--remote_host", type=str, default="localhost", ...)
# num_envs is never added to the parser

from_cli_args at line 58 accesses args.num_envs, but add_args_to_parser never adds this argument. This will cause AttributeError when using CLI construction. Add:

group.add_argument("--num_envs", type=int, default=1, help="Number of environments")

🟡 Warning: gr00t_remote_closedloop_policy.py:219-220 — Unused assignment to env_ids

if env_ids is None:
    env_ids = slice(None)  # <-- assigned but never used after this

The assigned env_ids is only passed to _chunking_state.reset(env_ids), but if env_ids was None, the slice assignment happens then it's passed. This works but the slice(None) assignment is misleading — it suggests the code intended to do something with it. Either remove the assignment and pass env_ids directly (letting ActionChunkingState.reset handle None), or use the variable meaningfully.

🟡 Warning: io_utils.py:223-227 — sys.path mutation persists globally

sys.path.append(str(path.parent))
importlib.import_module(path.stem)

Appending to sys.path without cleanup pollutes the import namespace for the entire process. If multiple modality configs exist in different directories with same-named modules, the first wins. Consider using importlib.util.spec_from_file_location and module_from_spec for isolated loading, or at minimum document this limitation.

🟡 Warning: io_utils.py:226 — importlib.import_module(path.stem) may fail on module names with dots
If the filename is my.modality.config.py, path.stem returns my.modality.config, which import_module interprets as a package path. Use importlib.util.spec_from_file_location(path.stem, path) for robustness.

🔵 Improvement: gr00t_remote_closedloop_policy.py:21 — Import at module level may fail in base container

from gr00t.policy.server_client import PolicyClient as Gr00tPolicyClient

The PR description states this should work in the base container, but this top-level import requires gr00t to be importable at module load time. Verify that the --no-deps -e install in the Dockerfile makes this import succeed without optional deps like torch/transformers. If server_client.py itself imports heavy deps at module level, this will fail.

🔵 Improvement: test_gr00t_remote_closedloop_policy.py:96 — _make_action_response missing torso_rpy key
The test comment says EXPECTED_ACTION_DIM = 50 # 43 joints + 3 nav + 1 base height + 3 torso rpy, but _make_action_response only generates joint groups + navigate_command + base_height_command. If build_gr00t_action_tensor expects torso_rpy, the test may be passing by accident (perhaps it's optional). Verify the action translation handles missing keys correctly or add the key to the mock response.

🔵 Improvement: Dockerfile.isaaclab_arena:108 — Pinned versions may conflict with Isaac Sim's environment

RUN /isaac-sim/python.sh -m pip install msgpack==1.1.0 msgpack-numpy==0.4.8 pyzmq==27.0.1

These exact versions may conflict with existing packages in the Isaac Sim base image. Consider using >= constraints or verifying compatibility, especially since Isaac Sim 6.0.0 may have its own zmq/msgpack dependencies.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 6, 2026

Greptile Summary

This PR introduces Gr00tRemoteClosedloopPolicy, which reuses the local obs/action translation pipeline but delegates inference to a remote GR00T PolicyClient over ZMQ, enabling closed-loop evaluation from the base Arena container without requiring the full GR00T training stack.

  • New policy (gr00t_remote_closedloop_policy.py): wraps Gr00tPolicyClient, reuses ActionChunkingState, and exposes the standard PolicyBase interface; the from_args() CLI path reads args.num_envs directly but add_args_to_parser does not register --num_envs, crashing any production invocation through that flow.
  • io_utils.py fallback: replaces the sys.path-mutating importlib.import_module approach with importlib.util.spec_from_file_location, correctly bypassing the module cache and avoiding path mutation; catches ImportError when tyro is absent so the base container can still load modality configs.
  • Dockerfile: unconditionally installs msgpack/pyzmq/msgpack-numpy and pip install --no-deps -e GR00T before the INSTALL_GROOT gate, making gr00t.policy.server_client importable in the base image without pulling in torch or flash-attn.

Confidence Score: 4/5

The new policy is safe for direct-construction usage and the test suite is thorough, but the CLI integration path via from_args() will crash in production because add_args_to_parser does not register --num_envs yet from_cli_args reads it unconditionally.

The from_args() → from_cli_args() path reads args.num_envs directly (line 57) but add_args_to_parser never calls add_argument("--num_envs", ...), so any invocation through the standard CLI flow hits AttributeError. The unit tests avoid this path entirely by constructing Gr00tRemoteClosedloopPolicyArgs directly. All other logic — the ZMQ client connection, obs/action translation, chunking state, and modality-config fallback — looks correct and well-tested.

isaaclab_arena_gr00t/policy/gr00t_remote_closedloop_policy.py — specifically add_args_to_parser and from_cli_args.

Important Files Changed

Filename Overview
docker/Dockerfile.isaaclab_arena Unconditionally installs GR00T with --no-deps before the INSTALL_GROOT gate; adds msgpack/pyzmq for the lightweight remote-client path.
isaaclab_arena_gr00t/policy/gr00t_remote_closedloop_policy.py New remote closed-loop policy delegating inference to a GR00T PolicyClient over ZMQ; CLI path via from_args() has an untested --num_envs gap.
isaaclab_arena_gr00t/tests/test_gr00t_remote_closedloop_policy.py Comprehensive mock-based unit tests covering obs structure, action shape, reset propagation, and ping failure; all tests bypass the CLI path.
isaaclab_arena_gr00t/utils/io_utils.py Fallback modality-config loader using importlib.util.spec_from_file_location avoids sys.path mutation; FileNotFoundError message is still misleading for existing non-.py paths.

Sequence Diagram

sequenceDiagram
    participant Eval as Eval Loop
    participant Policy as Gr00tRemoteClosedloopPolicy
    participant Chunk as ActionChunkingState
    participant Client as Gr00tPolicyClient (ZMQ)
    participant Server as GR00T Policy Server

    Eval->>Policy: __init__(config)
    Policy->>Client: ping()
    Client->>Server: ping request
    Server-->>Client: pong
    Client-->>Policy: True
    Policy-->>Eval: policy ready

    Eval->>Policy: set_task_description(desc)
    Policy-->>Eval: confirmed description

    loop Each step
        Eval->>Policy: get_action(env, observation)
        Policy->>Chunk: get_action(fetch_chunk_fn)
        alt chunk exhausted
            Chunk->>Policy: fetch_chunk_fn()
            Policy->>Policy: extract_obs_numpy_from_torch()
            Policy->>Policy: build_gr00t_policy_observations()
            Policy->>Client: get_action(policy_observations)
            Client->>Server: ZMQ request
            Server-->>Client: grouped joint action dict
            Client-->>Policy: (robot_action_policy, meta)
            Policy->>Policy: build_gr00t_action_tensor()
            Policy-->>Chunk: action_tensor [N, horizon, action_dim]
        end
        Chunk-->>Policy: action [N, action_dim]
        Policy-->>Eval: action tensor
    end

    Eval->>Policy: reset(env_ids)
    Policy->>Client: reset()
    Policy->>Chunk: reset(env_ids)
Loading

Reviews (4): Last reviewed commit: "fix" | Re-trigger Greptile

Comment thread isaaclab_arena_gr00t/policy/gr00t_remote_closedloop_policy.py
Comment thread isaaclab_arena_gr00t/policy/gr00t_remote_closedloop_policy.py
Comment thread isaaclab_arena_gr00t/utils/io_utils.py
Copy link
Copy Markdown
Contributor

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Isaac Lab Review Bot

Summary

This is a follow-up review on commit 0677561. Comparing to the previous review, none of the previous critical or warning findings have been addressed. The diff between commits shows no changes to the files reviewed.

Previous Concerns Still Outstanding

🔴 Critical: reset() still ignores env_ids — gr00t_remote_closedloop_policy.py:213-216 unconditionally calls self._client.reset() regardless of which environments are being reset.

🔴 Critical: --num_envs CLI argument still missingadd_args_to_parser doesn't add it, but from_cli_args expects it at line 58.

🟡 Warning: sys.path mutation in io_utils.py:223 — still persists globally without cleanup.

Implementation Verdict

Significant concerns — the previous critical issues remain unaddressed. The missing --num_envs CLI argument will cause runtime failures for anyone using the CLI interface, and the reset semantics bug could cause subtle state corruption in multi-environment scenarios.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 6, 2026

Want your agent to iterate on Greptile's feedback? Try greploops.

Comment thread isaaclab_arena_gr00t/utils/io_utils.py Outdated
Copy link
Copy Markdown
Contributor

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Isaac Lab Review Bot

Summary

This is a follow-up review on commit 48a5beb. The diff between this commit and the previous review (0677561) shows no changes to the reviewed files. All previous critical concerns remain unaddressed.

Previous Concerns Still Outstanding

🔴 Critical: reset() still ignores env_ids — gr00t_remote_closedloop_policy.py:213-216 unconditionally calls self._client.reset() regardless of which environments are being reset. This breaks the expected semantics when only a subset of environments need reset.

🔴 Critical: --num_envs CLI argument still missingadd_args_to_parser doesn't add --num_envs, but from_cli_args at line 58 expects args.num_envs to exist. This will cause AttributeError for anyone using the CLI.

🟡 Warning: sys.path mutation persists — io_utils.py:223 appends to sys.path without cleanup, which pollutes the module search path globally.

Implementation Verdict

Significant concerns — No changes since last review. The CLI will crash on use, and reset semantics are broken.

Comment thread isaaclab_arena_gr00t/policy/gr00t_remote_closedloop_policy.py
@cvolkcvolk
Copy link
Copy Markdown
Collaborator

Great! Thanks a lot

xyao-nv added 2 commits May 6, 2026 11:25
from_cli_args reads args.num_envs, but the parser never registered
the flag, so any CLI invocation of Gr00tRemoteClosedloopPolicy raised
AttributeError before reaching the server.

Signed-off-by: Xinjie Yao <xyao@nvidia.com>
@xyao-nv xyao-nv enabled auto-merge (squash) May 6, 2026 20:33
@xyao-nv xyao-nv merged commit 3111385 into main May 6, 2026
6 checks passed
pulkitg01 pushed a commit that referenced this pull request May 12, 2026
## Summary
Add GR00T remote closed-loop policy + base-container support

## Detailed description
- Why: Run GR00T closed-loop eval against a remote server from the base
Arena container, without dragging in the full GR00T training stack
(torch / flash-attn / transformers).
- What:
-
1. New `Gr00tRemoteClosedloopPolicy` that reuses the local obs/action
pipeline but talks to GR00T's `PolicyClient` over zmq, plus mock-based
unit tests
2. `io_utils.load_gr00t_modality_config_from_file` falls back to an
inlined `load_modality_config` so it works without tyro
3. base Dockerfile now installs msgpack/msgpack-numpy/pyzmq and pip
install --no-deps -e of the GR00T source.

- Impact: Remote GR00T eval now works in the base container

## TODO
- Action chunking
- Pi remote policy
- Refactor policy cfg class
- Remove existing gr00t imple

---------

Signed-off-by: Xinjie Yao <xyao@nvidia.com>
(cherry picked from commit 3111385)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants