Skip to content

Add EnvGraphSpec class & its yaml parser#690

Open
xyao-nv wants to merge 4 commits into
mainfrom
xyao/dev/env_yaml
Open

Add EnvGraphSpec class & its yaml parser#690
xyao-nv wants to merge 4 commits into
mainfrom
xyao/dev/env_yaml

Conversation

@xyao-nv
Copy link
Copy Markdown
Collaborator

@xyao-nv xyao-nv commented May 21, 2026

Summary

Add EnvGraphSpec class & its yaml parser

Detailed description

  • Reason: Support loading environment graph YAML as typed structured data.
  • Changed: Added EnvGraphSpec/EnvGraphStateSpec, parser test, and maple-table pick-and-place fixture.
  • Impact: Enables graph-based env specs to be validated and consumed.

Next Step

  • Connect Spec to ArenaEnvBuilder

@xyao-nv xyao-nv marked this pull request as ready for review May 21, 2026 01:31
Copy link
Copy Markdown
Contributor

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Automated Code Review

Summary

This PR introduces a well-structured YAML parser for environment graph specifications using frozen dataclasses. The design enables typed, validated environment specs for graph-based simulation environments. The implementation is clean and the validation logic is thorough.

✅ Strengths

  • Immutable design: Frozen dataclasses prevent accidental mutation of spec objects after creation
  • Comprehensive validation: Reference integrity checks ensure task state_specs and constraint nodes exist
  • Clean factory pattern: from_dict and from_yaml methods provide clear construction APIs
  • Convenient accessors: nodes_by_id, tasks_by_id, state_specs_by_id properties enable O(1) lookups
  • Good test coverage: Tests verify happy path parsing, duplicate IDs, and invalid references

🔧 Issues to Address

1. Bug in _optional_dict helper (High)

File: env_graph_spec.py lines 196-199

def _optional_dict(data: dict[str, Any], key: str) -> dict[str, Any]:
    value = data.get(key, {})  # value defaults to {} when missing
    assert value is None or isinstance(value, dict), ...  # value can never be None here
    return dict(value or {})

The assertion cannot catch invalid types because data.get(key, {}) returns {} when the key is missing. If the key exists with a non-dict value (e.g., a string or list), the assertion works, but the None check is dead code.

Suggested fix:

def _optional_dict(data: dict[str, Any], key: str) -> dict[str, Any]:
    value = data.get(key)
    assert value is None or isinstance(value, dict), f"Optional field '{key}' must be a dict when set"
    return dict(value or {})

2. Assertions vs Exceptions for Input Validation (Medium)

Using assert for input validation is risky because assertions are disabled when Python runs with -O (optimized mode). In production, malformed YAML would silently pass validation.

Suggestion: Consider raising ValueError or defining a custom SpecValidationError:

if not isinstance(value, str) or not value:
    raise ValueError(f"Missing required string field '{key}'")

3. Node parent references not validated (Medium)

EnvGraphNodeSpec has a parent field, but _assert_references_exist does not verify that node parents reference valid node IDs. For example, a node with parent: "nonexistent_node" would be accepted.

Suggestion: Add validation in _assert_references_exist:

for node in nodes:
    if node.parent is not None:
        assert node.parent in node_ids, f"Node '{node.id}' references unknown parent '{node.parent}'"

4. Task task_args references not validated (Low)

The task_args dictionary in tasks (e.g., object, destination, background) contains node ID references that are not validated against the node list. This is consistent with the stated "Next Step" to connect specs to ArenaEnvBuilder, but could catch errors earlier.

📋 Test Coverage Suggestions

  • Add tests for malformed YAML (wrong types for required fields)
  • Add tests for empty collections (no nodes, no tasks, no state_specs)
  • Add a test for invalid node parent references
  • Test from_dict directly with edge cases, not just via from_yaml

💡 Minor Suggestions

  • Copyright year inconsistency: Python files use "2025-2026", YAML/test files use "2026"
  • Consider adding a module-level docstring to env_graph_spec.py
  • The *_by_id properties rebuild dicts on each access — consider caching if accessed frequently

Overall this is a solid foundation for typed environment graph specs. The _optional_dict bug should be fixed before merge, and adding node parent validation would improve reference integrity.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 21, 2026

Greptile Summary

This PR introduces EnvGraphSpec and EnvGraphStateSpec — typed dataclasses that parse environment-graph YAML files — along with a maple-table pick-and-place fixture and a parser test suite. The implementation is straightforward and well-structured, but the environments/__init__.py file was deleted as part of the change, which will break find_packages() discovery and prevent the entire environments subpackage from being installed.

  • env_graph_spec.py adds EnvGraphSpec/EnvGraphStateSpec dataclasses, a from_yaml/from_dict parser, and reference-integrity validation (unique IDs for nodes/tasks/state specs; cross-reference checks for constraints and task state refs).
  • __init__.py deletion removes the marker that setuptools.find_packages() requires to discover isaaclab_arena.environments; all modules in that directory will be missing from an installed package.
  • test_env_graph_spec.py covers happy-path loading and three error scenarios using a realistic YAML fixture.

Confidence Score: 4/5

Do not merge until the environments/__init__.py deletion is reverted; without it the whole environments subpackage will be excluded from installed distributions.

The accidental deletion of environments/__init__.py means find_packages() in setup.py will no longer discover isaaclab_arena.environments, causing every existing import from that subpackage to fail in any installed (non-editable) environment. The rest of the change is a clean new addition.

isaaclab_arena/environments/__init__.py — its deletion breaks package discovery and should be restored.

Important Files Changed

Filename Overview
isaaclab_arena/environments/init.py File deleted; its absence causes find_packages() to exclude the entire environments subpackage from installed distributions, breaking all existing imports.
isaaclab_arena/environments/env_graph_spec.py New typed dataclass hierarchy with YAML parser; validation uses assert (stripped by -O) and is missing uniqueness checks on per-state-spec constraint IDs and node parent reference validation.
isaaclab_arena/tests/test_env_graph_spec.py Covers happy-path loading and three error cases; positional index lookups for constraints make tests fragile to YAML reordering.
isaaclab_arena/tests/test_data/pick_and_place_maple_table_env_graph.yaml Well-structured fixture YAML with 5 nodes, 1 task, and 2 state specs; constraint IDs are correctly reused across state specs.

Class Diagram

%%{init: {'theme': 'neutral'}}%%
classDiagram
    class EnvGraphSpec {
        +str name
        +list~EnvGraphNodeSpec~ nodes
        +list~EnvGraphTaskSpec~ tasks
        +list~EnvGraphStateSpec~ state_specs
        +from_yaml(path) EnvGraphSpec
        +from_dict(data) EnvGraphSpec
        +nodes_by_id() dict
        +tasks_by_id() dict
        +state_specs_by_id() dict
    }
    class EnvGraphNodeSpec {
        +str id
        +str name
        +str type
        +str|None parent
        +str|None prim_path
        +str|None object_type
        +dict params
    }
    class EnvGraphTaskSpec {
        +str id
        +str name
        +str type
        +dict~str,str~ state_specs
        +dict task_args
    }
    class EnvGraphStateSpec {
        +str id
        +str name
        +EnvGraphEdgesSpec edges
    }
    class EnvGraphEdgesSpec {
        +list~EnvGraphConstraintSpec~ spatial_constraints
        +list~EnvGraphConstraintSpec~ task_constraints
    }
    class EnvGraphConstraintSpec {
        +str id
        +str type
        +str|None parent
        +str|None child
        +dict params
    }
    EnvGraphSpec --> EnvGraphNodeSpec : nodes
    EnvGraphSpec --> EnvGraphTaskSpec : tasks
    EnvGraphSpec --> EnvGraphStateSpec : state_specs
    EnvGraphStateSpec --> EnvGraphEdgesSpec : edges
    EnvGraphEdgesSpec --> EnvGraphConstraintSpec : constraints
    EnvGraphTaskSpec ..> EnvGraphStateSpec : references
    EnvGraphConstraintSpec ..> EnvGraphNodeSpec : parent/child refs
Loading

Comments Outside Diff (1)

  1. isaaclab_arena/environments/__init__.py

    P1 Accidental __init__.py deletion breaks package installation

    setup.py uses find_packages(include=["isaaclab_arena*"]), which only discovers directories that contain an __init__.py. Deleting this file turns isaaclab_arena.environments into a namespace package, so find_packages will no longer include it — and env_graph_spec.py, arena_env_builder.py, isaaclab_arena_environment.py, and all the other modules in that directory will be absent from an installed distribution. All existing imports (from isaaclab_arena.environments.arena_env_builder import ArenaEnvBuilder, etc.) will fail after a pip install. Every other subpackage in the repo (affordances, assets, cli, etc.) retains its __init__.py; this deletion appears unintentional.

Reviews (3): Last reviewed commit: "address comments" | Re-trigger Greptile

Comment on lines +165 to +180
def _required_str(data: dict[str, Any], key: str) -> str:
value = data.get(key)
assert isinstance(value, str) and value, f"Missing required string field '{key}'"
return value


def _optional_str(data: dict[str, Any], key: str) -> str | None:
value = data.get(key)
assert value is None or isinstance(value, str), f"Optional field '{key}' must be a string when set"
return value


def _optional_dict(data: dict[str, Any], key: str) -> dict[str, Any]:
value = data.get(key, {})
assert value is None or isinstance(value, dict), f"Optional field '{key}' must be a dict when set"
return dict(value or {})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 All three helper functions use assert for input validation. Python strips assert statements when the interpreter is run with -O or -OO (the optimize flag), so every type check silently becomes a no-op. For example, with -O, _required_str will return None (or a non-string value) without raising, and the caller gets a frozen dataclass with id=None, name=None, or type=None, which propagates errors to a much later (and harder to diagnose) point.

Suggested change
def _required_str(data: dict[str, Any], key: str) -> str:
value = data.get(key)
assert isinstance(value, str) and value, f"Missing required string field '{key}'"
return value
def _optional_str(data: dict[str, Any], key: str) -> str | None:
value = data.get(key)
assert value is None or isinstance(value, str), f"Optional field '{key}' must be a string when set"
return value
def _optional_dict(data: dict[str, Any], key: str) -> dict[str, Any]:
value = data.get(key, {})
assert value is None or isinstance(value, dict), f"Optional field '{key}' must be a dict when set"
return dict(value or {})
def _required_str(data: dict[str, Any], key: str) -> str:
value = data.get(key)
if not isinstance(value, str) or not value:
raise ValueError(f"Missing required string field '{key}'")
return value
def _optional_str(data: dict[str, Any], key: str) -> str | None:
value = data.get(key)
if value is not None and not isinstance(value, str):
raise TypeError(f"Optional field '{key}' must be a string when set, got {type(value).__name__}")
return value
def _optional_dict(data: dict[str, Any], key: str) -> dict[str, Any]:
value = data.get(key, {})
if value is not None and not isinstance(value, dict):
raise TypeError(f"Optional field '{key}' must be a dict when set, got {type(value).__name__}")
return dict(value or {})

Comment on lines +149 to +162
@property
def nodes_by_id(self) -> dict[str, EnvGraphNodeSpec]:
"""Return nodes keyed by id."""
return {node.id: node for node in self.nodes}

@property
def tasks_by_id(self) -> dict[str, EnvGraphTaskSpec]:
"""Return tasks keyed by id."""
return {task.id: task for task in self.tasks}

@property
def state_specs_by_id(self) -> dict[str, EnvGraphStateSpec]:
"""Return state specs keyed by id."""
return {state_spec.id: state_spec for state_spec in self.state_specs}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Silent data loss on duplicate IDs

nodes_by_id, tasks_by_id, and state_specs_by_id are plain dict comprehensions — if two entries share the same id, the later entry silently overwrites the earlier one with no error. A YAML author who accidentally duplicates a node ID (or a programmatic builder that emits a duplicate) will get a graph that is missing an entry with no indication of the problem. Since these properties are the primary lookup API for downstream consumers, this is a correctness hazard worth catching at parse time in from_dict.

Comment on lines +171 to +174
def _optional_str(data: dict[str, Any], key: str) -> str | None:
value = data.get(key)
assert value is None or isinstance(value, str), f"Optional field '{key}' must be a string when set"
return value
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 _optional_str accepts an empty string "" as a valid return value (the assert only guards the type, not emptiness). _required_str, by contrast, explicitly rejects empty strings via and value. A YAML entry like parent: "" would propagate as a non-None but semantically meaningless reference rather than being treated the same as parent: null. Aligning the guard makes the behaviour consistent.

Suggested change
def _optional_str(data: dict[str, Any], key: str) -> str | None:
value = data.get(key)
assert value is None or isinstance(value, str), f"Optional field '{key}' must be a string when set"
return value
def _optional_str(data: dict[str, Any], key: str) -> str | None:
value = data.get(key)
if value is not None and not isinstance(value, str):
raise TypeError(f"Optional field '{key}' must be a string when set, got {type(value).__name__}")
return value or None

Comment on lines +38 to +55
cube_limits = initial_state.edges.spatial_constraints[2]
assert cube_limits.type == "position_limits"
assert cube_limits.child == "rubiks_cube_hot3d_robolab"
assert cube_limits.params == {
"x_min": 0.55,
"x_max": 0.70,
"y_min": -0.40,
"y_max": -0.10,
}

final_state = spec.state_specs_by_id["state_spec_1"]
assert isinstance(final_state, EnvGraphStateSpec)
in_constraint = final_state.edges.spatial_constraints[3]
assert in_constraint.type == "in"
assert in_constraint.parent == "bowl_ycb_robolab"
assert in_constraint.child == "rubiks_cube_hot3d_robolab"

reach_constraint = final_state.edges.task_constraints[0]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Fragile positional constraint lookups

spatial_constraints[2], spatial_constraints[3], and task_constraints[0] rely on the YAML declaration order being stable. If a future edit reorders constraints in the fixture YAML, these assertions silently test the wrong constraint (or raise an IndexError). Prefer looking up by id — a helper {c.id: c for c in initial_state.edges.spatial_constraints} mirrors the pattern already used for nodes and tasks.

@xyao-nv xyao-nv force-pushed the xyao/dev/env_yaml branch from a35fea4 to 790734f Compare May 21, 2026 05:55
Comment on lines +219 to +243
def _assert_references_exist(
nodes: list[EnvGraphNodeSpec],
tasks: list[EnvGraphTaskSpec],
state_specs: list[EnvGraphStateSpec],
) -> None:
node_ids = {node.id for node in nodes}
state_spec_ids = {state_spec.id for state_spec in state_specs}

for task in tasks:
for label, state_spec_id in task.state_specs.items():
assert state_spec_id in state_spec_ids, (
f"Task '{task.id}' references unknown state spec '{state_spec_id}' for '{label}'"
)

for state_spec in state_specs:
constraints = state_spec.edges.spatial_constraints + state_spec.edges.task_constraints
for constraint in constraints:
if constraint.parent is not None:
assert constraint.parent in node_ids, (
f"Constraint '{constraint.id}' references unknown parent node '{constraint.parent}'"
)
if constraint.child is not None:
assert constraint.child in node_ids, (
f"Constraint '{constraint.id}' references unknown child node '{constraint.child}'"
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Node parent reference not validated

_assert_references_exist validates every constraint.parent and constraint.child field against node_ids, but never validates EnvGraphNodeSpec.parent. In the fixture, maple_table_robolab_table has parent: maple_table_robolab — a typo here (e.g., maple_table_robolab_x) would pass validation and produce a spec object whose parent silently points to a non-existent node, yielding a KeyError or None only when downstream code eventually tries to resolve it via nodes_by_id.

Comment thread isaaclab_arena/environments/__init__.py Outdated
Comment thread isaaclab_arena/environments/env_graph_spec.py
Comment thread isaaclab_arena/environments/env_graph_spec.py Outdated
@xyao-nv xyao-nv force-pushed the xyao/dev/env_yaml branch from 790734f to 967b27c Compare May 21, 2026 20:38
@isaaclab-review-bot
Copy link
Copy Markdown
Contributor

🔄 Follow-up review (commit 967b27c)

Reviewed the incremental changes from 790734f967b27c.

Changes in this push:

  • Documentation updates (step_2_teleoperation.rst, step_4_evaluation.rst)
  • Removed empty isaaclab_arena/environments/__init__.py

No changes to env_graph_spec.py or its tests.

Previous feedback status:

  • ✅ Duplicate ID validation — already in place via _assert_unique_ids()
  • Remaining items still open (original inline comments visible in the review thread)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants