Skip to content

Add isaaclab_arena_openpi package for pi0 droid eval#663

Open
cvolkcvolk wants to merge 6 commits into
mainfrom
cvolk/feature/openpi
Open

Add isaaclab_arena_openpi package for pi0 droid eval#663
cvolkcvolk wants to merge 6 commits into
mainfrom
cvolk/feature/openpi

Conversation

@cvolkcvolk
Copy link
Copy Markdown
Collaborator

@cvolkcvolk cvolkcvolk commented May 8, 2026

Summary

Adds isaaclab_arena_openpi/, a new top-level package that lets arena evaluate the upstream openpi pi0/pi05/pi0_fast droid policies. For convenience ships the current openpi server as a dockerized container fixed to current main.

What's in the package

  • policy/pi0_droid_remote_policy.pyPi0DroidRemotePolicy (PolicyBase subclass). Connects to an openpi server, fetches one action chunk per open_loop_horizon steps, reconnects on dropped websockets up to MAX_RECONNECT_ATTEMPTS. --Single-env only (openpi server is single-obs, asserted at runtime).

  • policy/pi0_droid_config.py — wire-format constants, the variant horizon table, and Pi0DroidRemotePolicyArgs.

  • docker/build_openpi_server.sh — clones upstream Physical-Intelligence/openpi at a pinned commit, builds the server image with upstream's own serve_policy.Dockerfile. Optionally pushes to ngc for use on osmo/ nightly etc.

Usage

./isaaclab_arena_openpi/docker/build_openpi_server.sh    # one-time, ~3 min, ~19 GB

# Terminal 1: start server (pi05 default)
docker run --rm -it --gpus all --network=host \
  -e XLA_PYTHON_CLIENT_MEM_FRACTION=0.5 \
  isaaclab_arena_openpi-server:latest \
  uv run scripts/serve_policy.py policy:checkpoint \
    --policy.config=pi05_droid_jointpos_polaris \
    --policy.dir=gs://openpi-assets-simeval/pi05_droid_jointpos

# Terminal 2: arena client (inside the arena docker container)
python isaaclab_arena/evaluation/policy_runner.py     --viz kit     --policy_type isaaclab_arena_openpi.policy.pi0_droid_remote_policy.Pi0DroidRemotePolicy     --policy_variant pi05     --remote_host localhost --remote_port 8000     --num_steps 500     --enable_cameras --num_envs 1     --language_instruction "Pick up the Rubik's cube and place it in the bowl."     pick_and_place_maple_table       --embodiment droid_abs_joint_pos       --pick_up_object rubiks_cube_hot3d_robolab       --destination_location bowl_ycb_robolab       --hdr home_office_robolab

@cvolkcvolk cvolkcvolk changed the title Add isaaclab_arena_openpi package + auto-derive PolicyBase CLI DRAFT Add isaaclab_arena_openpi package + auto-derive PolicyBase CLI May 8, 2026
@cvolkcvolk cvolkcvolk force-pushed the cvolk/feature/openpi branch from b4206e0 to e11720c Compare May 11, 2026 14:38
@cvolkcvolk cvolkcvolk changed the title DRAFT Add isaaclab_arena_openpi package + auto-derive PolicyBase CLI Add isaaclab_arena_openpi package for pi0 droid eval May 11, 2026
Copy link
Copy Markdown
Contributor

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: isaaclab_arena_openpi Package

Thanks for adding this openpi integration! The overall architecture is clean and follows the established patterns from isaaclab_arena_gr00t. I've reviewed the implementation and have a few observations:

✅ Strengths

  1. Clean separation of concerns - Wire-format constants and variant table in pi0_droid_config.py, execution logic in the policy module. This makes maintenance straightforward when upstream changes.

  2. Well-documented - The README thoroughly explains the layout, installation, variant table, and rationale for not bundling an environment. The inline docstrings in the policy are also helpful.

  3. Comprehensive unit tests - 13 tests covering wire format, chunk caching, binarisation, clipping, reconnect logic, and edge cases like short server responses. The stubbing approach is clever and allows testing without heavy dependencies.

  4. Defensive chunking - The cache-and-advance pattern with explicit flush on reconnect is a sensible approach for a single-env scenario.


🔍 Observations / Questions

1. Broad exception handling in reconnect logic

_call_server_with_retry catches OSError in addition to the websocket-specific exceptions:

retryable_exceptions = (
    websockets.exceptions.ConnectionClosedError,
    websockets.exceptions.ConnectionClosedOK,
    OSError,
)

OSError is quite broad and could potentially mask unrelated errors (e.g., file permission issues if any file I/O is involved). Consider narrowing to specific socket errors like ConnectionRefusedError and ConnectionResetError, or at least logging which specific exception triggered the retry.

2. Missing validation for numeric edge cases

_unpack_pi0_response validates shape but doesn't check for NaN/Inf values which could propagate silently to the action space:

def _unpack_pi0_response(self, server_response: dict[str, Any]) -> np.ndarray:
    action_chunk = np.asarray(server_response[WIRE_ACTIONS])
    # ... shape validation only

A quick np.isfinite check could catch malformed server responses early.

3. Test coverage gap: empty task description

set_task_description raises ValueError for empty/None input, but this path isn't covered by tests:

def set_task_description(self, task_description: str | None) -> str:
    if not task_description:
        raise ValueError(...)

Consider adding a quick test for this.

4. Minor: Hardcoded gripper binarisation threshold

The 0.5 threshold is noted as matching upstream checkpoint behavior, but if a future checkpoint changes this, it would require a code change. Could be added to Pi0Variant for extensibility, though this may be over-engineering for now.


📝 Note on PR Description

The PR description mentions changes to PolicyBase.from_args and PolicyBase.add_args_to_parser with derived defaults, but those changes don't appear in this diff. Is that intentional for a follow-up, or did they get dropped?


Overall this is solid work for a draft. Once the end-to-end tests against a live openpi server are validated, this should be good to go. 👍

@cvolkcvolk cvolkcvolk force-pushed the cvolk/feature/openpi branch 21 times, most recently from a10bde1 to ebc820a Compare May 12, 2026 13:02
Adds a thin client for the upstream openpi WebSocket policy server
(pi05_droid_jointpos and friends), packaged the same way
isaaclab_arena_gr00t is: own pyproject, no environments, no embodiments,
reuses arena's pick_and_place_maple_table by name with
droid_abs_joint_pos. Touches no files under isaaclab_arena/.

- Pi0DroidRemotePolicy connects via openpi_client.WebsocketClientPolicy,
  caches one action chunk + step counter (single-env only — openpi
  server is single-obs), reconnects on dropped websockets up to
  MAX_RECONNECT_ATTEMPTS, and binarises the gripper after each chunk.
- Variant table (pi05 / pi0 / pi0_fast) selects action_chunk_length and
  optional arm-action clipping. pi0_fast clips to ±1.0 (it emits joint
  deltas); pi05 leaves arm dims untouched.
- pi0_droid_config.py centralises wire-format constants, the variant
  table, and the runtime args dataclass. The policy module holds only
  execution logic + the CLI plumbing PolicyBase requires.
- docker/Dockerfile + build.sh extend isaaclab_arena:latest with
  openpi-client (pinned via OPENPI_COMMIT) and this package. Pragmatic
  dev-mode overlay; canonical external-repo pattern (FROM
  nvcr.io/nvidia/isaac-sim, install Isaac Lab + arena + this) is
  referenced in the README.
- 13 unit tests with stubbed openpi_client cover wire format, obs
  extraction, gripper binarisation, variant clipping, single-env
  enforcement, and retry behaviour.

Run via:
  /isaac-sim/python.sh isaaclab_arena/evaluation/policy_runner.py \
    --policy_type \
      isaaclab_arena_openpi.policy.pi0_droid_remote_policy.Pi0DroidRemotePolicy \
    --policy_variant pi05 \
    --remote_host <pi0-server> --remote_port 8000 \
    --enable_cameras --num_episodes 5 --num_envs 1 \
    pick_and_place_maple_table --embodiment droid_abs_joint_pos

Signed-off-by: Clemens Volk <cvolk@nvidia.com>
@cvolkcvolk cvolkcvolk force-pushed the cvolk/feature/openpi branch from ebc820a to 36a4c72 Compare May 13, 2026 08:17
Copy link
Copy Markdown
Contributor

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated Review (Post-Cleanup #2)

Thanks for addressing the S3 concern! The latest commits simplify documentation and fix the pi0_fast bucket issue.

✅ Changes Since Last Review (31ee8a4265b989)

  1. Fixed pi0_fast bucket - Changed from s3://openpi-assets-simeval/... to gs://openpi-assets/checkpoints/polaris/pi0_fast_droid_jointpos_polaris. Now all variants use GCS (gs://), no mixed storage backends! 🎉

  2. README simplification - Removed verbose explanation about checkpoint selection rationale (cleaner docs)

  3. Config/policy comment cleanup - Trimmed module docstrings and inline comments in pi0_droid_config.py and pi0_droid_remote_policy.py (less noise, same functionality)

  4. Minor code style - Changed raise ValueErrorassert for variant validation


⚠️ One Small Note

The raise ValueErrorassert change in pi0_droid_remote_policy.py means the check can be optimized away with python -O. If this is intentional (trusting callers), fine. If the validation should always run, consider keeping ValueError.


✅ Previous Review Highlights (Still Valid)

  • Clean action pipeline with gripper binarisation delegated to arena's config
  • Docker server build workflow for reproducibility
  • Polaris config variants with proper GCS checkpoints

✅ Overall

All previous concerns addressed. Cleanup looks good. Ready to merge! 👍


📌 Reviewed at: 265b989

The previous suite mixed contract tests with low-signal checks:
horizon constants compared against themselves, shape-only round-trips,
and a stray invariant assertion on MAX_RECONNECT_ATTEMPTS. Drop those
and keep only:

* test_pack_request_uses_pi0_wire_keys — the wire-format contract with
  the openpi droid server.
* test_get_action_caches_chunk_and_advances_index — the core cache and
  index-advancement behavior of get_action.
* test_call_server_with_retry_reconnects_on_drop — retry happy path.
* test_call_server_with_retry_gives_up_after_max_attempts — retry
  exhaustion.

11 tests -> 4. Same monkeypatch strategy on WebsocketClientPolicy so we
still exercise the real openpi_client.

Signed-off-by: Clemens Volk <cvolk@nvidia.com>
Adds isaaclab_arena_openpi/docker/build_server.sh: clones upstream
Physical-Intelligence/openpi at a pinned commit, builds with upstream's
own scripts/docker/serve_policy.Dockerfile, and appends a COPY step so
the resulting image is self-contained (no volume mount at runtime).

Tags as isaaclab_arena_openpi-server:<short-sha> and :latest locally.
With --push, retags and pushes to nvcr.io/nvstaging/isaac-amr/<image>
to match arena's existing push_to_ngc.sh convention.

Pins OPENPI_COMMIT in docker/Dockerfile.isaaclab_arena to the same
commit so the openpi-client baked into the arena image and the server
speak the same wire format.

Adapted from robolab/docker/build_openpi_server.sh.

Signed-off-by: Clemens Volk <cvolk@nvidia.com>
The previous quickstart asked users to clone openpi on the host and run
uv from there. Replace with the docker server flow now that
build_server.sh exists:

* Step 1 builds the openpi-server image via build_server.sh.
* Step 2 docker-runs the server with explicit --policy.config and
  --policy.dir flags so users see the full invocation.
* Step 3 is unchanged (arena client).

Document the working hybrid that grasps reliably: upstream's polaris
config (declares the Pi0 architecture) plus xuningy's training-output
bucket weights (load both params and norm stats, the norm stats from
--policy.dir override the config-baked path). Same architecture as
the polaris bucket weights, but trained without polaris cotrain data.

Split the variants table into separate --policy.config and --policy.dir
columns so the distinction is explicit. pi0_fast row flagged as
untested (needs s3fs in the image; configs reference s3://).

Signed-off-by: Clemens Volk <cvolk@nvidia.com>
* Rename build_server.sh to build_openpi_server.sh so the script is
  recognizable out of context.
* Trim header on build_openpi_server.sh to one descriptive line plus
  usage.
* Trim openpi-client comment in arena's Dockerfile to one line.
* Drop the eval_jobs_configs JSON and its README section; users can
  drive single rollouts with policy_runner.py for now.
* Drop the unit-test module docstring.

Signed-off-by: Clemens Volk <cvolk@nvidia.com>
* Convert if-raise ValueError on policy_variant to assert per AGENTS.md
  style guide.
* Reword task_description assertion message to state the prerequisite
  rather than imply a runtime fix ("set via --language_instruction or
  on the task definition").
* Drop three redundant private-method docstrings that restated the
  method name (_fetch_action_chunk, _extract_droid_observation,
  _pack_pi0_request) plus the stray "# Unreachable" comment.
* Restore "Fixed by upstream pi0 droid checkpoints" qualifier on
  ACTION_DIM / TARGET_IMAGE_SIZE to signal these are not tunable.
* README cleanup: drop the empirical-perf paragraph from the
  config-vs-weights block, fix the pi0_fast row to point at the
  upstream polaris GCS bucket instead of an s3:// path with a
  speculative s3fs caveat.

Signed-off-by: Clemens Volk <cvolk@nvidia.com>
@cvolkcvolk cvolkcvolk marked this pull request as ready for review May 13, 2026 16:36
RUN /isaac-sim/python.sh -m pip install av

# openpi remote-client. Pin OPENPI_COMMIT to the commit the server runs at.
ARG OPENPI_COMMIT=c23745b5ad24e98f66967ea795a07b2588ed6c79
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now I put the openpi dependency on the client into the arena base docker. We could in the future layer this instead ontop as a new container

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 13, 2026

Greptile Summary

Adds isaaclab_arena_openpi/, a new top-level package that wraps the upstream openpi WebSocket server as a PolicyBase subclass (Pi0DroidRemotePolicy) for evaluating pi0/pi05/pi0_fast DROID policies inside Isaac Lab Arena, along with a Docker build script that pins the server to a matching commit and unit tests covering the wire-format contract and reconnect logic.

  • Pi0DroidRemotePolicy fetches action chunks over WebSocket, caches them for open_loop_horizon steps, and reconnects with retry on dropped connections; single-env only, enforced by assertion.
  • build_openpi_server.sh clones upstream openpi at a pinned SHA, appends a COPY . /app layer to the upstream Dockerfile, and optionally pushes to NGC — the pinned commit matches OPENPI_COMMIT in the arena Dockerfile so client and server share the same wire format.
  • Dockerfile.isaaclab_arena installs openpi-client at the same pinned commit via --no-deps pip install, consistent with other sub-package additions.

Confidence Score: 4/5

Safe to merge; the new package is self-contained and does not modify existing evaluation paths.

The implementation is clean and well-tested. The only notable items are a docstring that overstates the reconnect count by one and no explicit close/cleanup path for the WebSocket client resource. Neither affects correctness during a normal eval run.

pi0_droid_remote_policy.py — the retry docstring and missing WebSocket cleanup are worth a quick look before this pattern is copied to future remote policies.

Important Files Changed

Filename Overview
isaaclab_arena_openpi/policy/pi0_droid_remote_policy.py New Pi0DroidRemotePolicy class with WebSocket-based action fetching, open-loop chunk caching, and retry logic; docstring on _call_server_with_retry overstates reconnect count by one, and no close() path exists for the WebSocket resource.
isaaclab_arena_openpi/policy/pi0_droid_config.py Clean constants and dataclass config file; variant horizon table is straightforward and easy to extend.
isaaclab_arena_openpi/tests/test_pi0_droid_remote_policy.py Well-structured unit tests covering wire-format keys, chunk caching/indexing, reconnect recovery, and give-up-after-max-attempts; patching strategy correctly targets _wait_for_server to avoid real connections.
isaaclab_arena_openpi/docker/build_openpi_server.sh Correct pinned-commit Docker build workflow; does a full git clone that could be made faster with --filter=blob:none.
docker/Dockerfile.isaaclab_arena Adds pinned openpi-client installation and COPY for the new package directory; consistent with how other sub-packages are included.
setup.py Trivial addition of isaaclab_arena_openpi* to the find_packages list; matches the pattern used by other sub-packages.

Sequence Diagram

sequenceDiagram
    participant Runner as policy_runner.py
    participant Policy as Pi0DroidRemotePolicy
    participant Client as WebsocketClientPolicy
    participant Server as openpi server (Docker)

    Runner->>Policy: set_task_description(instruction)
    loop Every open_loop_horizon steps
        Runner->>Policy: get_action(env, observation)
        Policy->>Policy: extract DROID obs (cameras + proprio)
        Policy->>Policy: pack pi0 wire-format request
        Policy->>Client: infer(request)
        Client-->>Server: WebSocket send
        Server-->>Client: action chunk (H x 8)
        Client-->>Policy: actions ndarray
        Policy->>Policy: cache chunk[:open_loop_horizon]
        Policy-->>Runner: action[0] as torch.Tensor (1x8)
    end
    loop Cached steps (open_loop_horizon - 1 remaining)
        Runner->>Policy: get_action(env, observation)
        Policy-->>Runner: cached action[i] as torch.Tensor (1x8)
    end
    note over Client,Server: On ConnectionClosedError/OSError: reconnect up to MAX_RECONNECT_ATTEMPTS-1 times, flush cache on each reconnect
Loading

Reviews (1): Last reviewed commit: "Address review feedback: trim docstrings..." | Re-trigger Greptile

Comment on lines +155 to +156
def _call_server_with_retry(self, server_request: dict[str, Any]) -> dict[str, Any]:
"""Send the request, reconnecting up to ``MAX_RECONNECT_ATTEMPTS`` times.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 The docstring claims to reconnect "up to MAX_RECONNECT_ATTEMPTS times" but the loop structure only reconnects MAX_RECONNECT_ATTEMPTS - 1 (2) times — the loop has 3 iterations, but the last iteration raises immediately on failure without reconnecting. A caller or operator who reads the docstring and tunes MAX_RECONNECT_ATTEMPTS will get one fewer reconnect than expected.

Suggested change
def _call_server_with_retry(self, server_request: dict[str, Any]) -> dict[str, Any]:
"""Send the request, reconnecting up to ``MAX_RECONNECT_ATTEMPTS`` times.
def _call_server_with_retry(self, server_request: dict[str, Any]) -> dict[str, Any]:
"""Send the request, making up to ``MAX_RECONNECT_ATTEMPTS`` total attempts
(i.e. reconnecting up to ``MAX_RECONNECT_ATTEMPTS - 1`` times).

Comment on lines +48 to +51
self._websocket_client = websocket_client_policy.WebsocketClientPolicy(
host=self._remote_host, port=self._remote_port
)
print("[Pi0DroidRemotePolicy] Connected.")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Missing WebSocket cleanup on policy teardown

_websocket_client is created in __init__ but there is no close() or __del__ override to shut the connection down. If the runner exits or a new policy is instantiated in the same process, the underlying TCP connection and async event-loop resources held by WebsocketClientPolicy are leaked. Consider adding a close() (or __del__) that calls the equivalent cleanup method the upstream client exposes so the resource is freed when the policy is discarded.

Comment on lines +32 to +34
GIT_LFS_SKIP_SMUDGE=1 git clone --quiet "$OPENPI_REPO" "$TMPDIR/openpi"
cd "$TMPDIR/openpi"
git checkout "$COMMIT"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Full-history clone is slower than necessary

git clone without --filter=blob:none fetches the complete object graph including every historical blob. Consider adding --filter=blob:none (partial clone — fetches tree and commit objects only, blobs fetched on demand) to shorten the one-time build step without changing correctness.

IMAGE_NAME="${IMAGE_NAME:-isaaclab_arena_openpi-server}"
NGC_PATH="${NGC_PATH:-nvcr.io/nvstaging/isaac-amr/${IMAGE_NAME}}"
OPENPI_REPO="https://github.com/Physical-Intelligence/openpi"
DEFAULT_COMMIT="c23745b5ad24e98f66967ea795a07b2588ed6c79"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you unify this spec defining the src of openpi with Dockerfile? I wonder if having a shared .env could work.

@@ -0,0 +1,99 @@
# isaaclab_arena_openpi
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can it be doc rst instead of readme?

set -euo pipefail

IMAGE_NAME="${IMAGE_NAME:-isaaclab_arena_openpi-server}"
NGC_PATH="${NGC_PATH:-nvcr.io/nvstaging/isaac-amr/${IMAGE_NAME}}"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

External users may not have write access to this path

trap 'rm -rf "$TMPDIR"' EXIT

echo "Cloning openpi at ${COMMIT} ..."
GIT_LFS_SKIP_SMUDGE=1 git clone --quiet "$OPENPI_REPO" "$TMPDIR/openpi"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if the clone could be optional as users may already have a PI src repo and just want to build server docker (if Pi did not ship it).

Comment on lines +49 to +63
docker build \
--network=host \
-f "$TMPDIR/Dockerfile" \
-t "${IMAGE_NAME}:${SHORT_HASH}" \
-t "${IMAGE_NAME}:latest" \
.

echo "Built ${IMAGE_NAME}:${SHORT_HASH} (also tagged :latest)"

if [ "$PUSH" = true ]; then
echo "Pushing to ${NGC_PATH}:${SHORT_HASH}"
docker tag "${IMAGE_NAME}:${SHORT_HASH}" "${NGC_PATH}:${SHORT_HASH}"
docker push "${NGC_PATH}:${SHORT_HASH}"
echo "Pushed ${NGC_PATH}:${SHORT_HASH}"
fi
Copy link
Copy Markdown
Collaborator

@xyao-nv xyao-nv May 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we imagine every policy builds their own docker, then can we make the docker build & push more universal?
Not bind to Pi server docker.

I personally think it's a valid assumption.

self._next_chunk_step += 1
return torch.from_numpy(next_action_np).to(dtype=torch.float32, device=self.device).unsqueeze(0)

def reset(self, env_ids: torch.Tensor | None = None) -> None:
Copy link
Copy Markdown
Collaborator

@xyao-nv xyao-nv May 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a TODO for env_id reset when parallel is supported?

)
)

def get_action(self, env: gym.Env, observation: dict[str, Any]) -> torch.Tensor:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a TODO for extending to parallel envs?

self._next_chunk_step = 0

def _fetch_action_chunk(self, observation: dict[str, Any]) -> np.ndarray:
droid_obs = self._extract_droid_observation(observation)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it have to droid dependent? Can you extend to other embodiments?
Can the dict key be a dataclass/cfg?


def _extract_droid_observation(self, observation: dict[str, Any]) -> dict[str, np.ndarray]:
cam = observation["camera_obs"]
proprio = observation["policy"]
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We kinda enforce "camera_obs" as the key when inserting camera to embodiment. But how about "policy"? Is it fixed?

)


class Pi0DroidRemotePolicy(PolicyBase):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the idea of limiting the class to only robot specific closedloop policy.
It's okay that you have only tested with Droid, but when building the policy class, it shall be embodiment agnostic, i.e. Pi0RemotePolicy.
You can provide embodiment specific cfg as the data member.

Copy link
Copy Markdown
Collaborator

@xyao-nv xyao-nv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I have to request changes -- can policy class be embodiment agostic?

remote_port=args.remote_port,
)
)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When closing the connection (esp it's used in eval_runner), what's the expected components needed? Gr00t side I have added closing socket and cleaning context to prevent stalling memory.

"prompt": language_instruction,
}

def _call_server_with_retry(self, server_request: dict[str, Any]) -> dict[str, Any]:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this func could be in the parent class RemotePolicy. Retry with max attempts will always be needed.

name = "pi0_droid_remote"
config_class = Pi0DroidRemotePolicyArgs

def __init__(self, config: Pi0DroidRemotePolicyArgs) -> None:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need action_scheduler_cls? This MR seems to only contain obs -> action_chunk. Can you add a TODO for action_chunk to action

Comment on lines +9 to +21
ACTION_DIM = 8
TARGET_IMAGE_SIZE = (224, 224)

# How many actions to replay before refetching a new chunk from the server.
OPEN_LOOP_HORIZON_BY_VARIANT: dict[str, int] = {
"pi05": 15,
"pi0": 10,
"pi0_fast": 10,
}
DEFAULT_VARIANT = "pi05"

ARENA_EXTERNAL_CAMERA_KEY = "external_camera_rgb"
ARENA_WRIST_CAMERA_KEY = "wrist_camera_rgb"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we do a closedloop-embodiment dataclass? Hydra?
So we can unify how cfg are supposed to set, you have py, and g00t has yaml.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants