feat(rel): materialize whole node values for RETURN n #890
A bare `RETURN n` over a NodeScan-bound variable now yields a whole node value
— Struct{node_uuid, labels: List<Utf8>, <typed properties>} — instead of
failing on the bare `var_N` qualifier. Unblocks #753 (startNode/endNode
returning a node) and the TCK whole-node scenarios.
- gf-ir/binder: track node vars (+ their pattern label); rewrite a bare node var
in a TERMINAL RETURN to `_node_struct(VarRef, "Label")`. WITH / intermediate
projections are left as references (materializing there would break downstream
property access). The label is threaded because the lowerer's ontology map is
empty in exploratory mode.
- gf-rel: assemble the value at lowering time — `node_value_struct` composes
named_struct{node_uuid, labels, <props>} gated null_unless(node_uuid present);
the property columns come from a per-plan VarId->NodeShape map built from the
NodeScans (the same columns join_node_properties materializes as var_N.<prop>).
- e2e: `MATCH (n:Person {name:'Alice'}) RETURN n` returns a struct with node_uuid,
labels ["Person"], and a readable `name` = "Alice".
Validated: gf-cypher / gf-ir / gf-rel suites + gf-api e2e_baseline green; fmt clean.
Scope: labeled patterns + terminal RETURN. Deferred to follow-ups: the unlabelled
`MATCH (n) RETURN n` label (needs a runtime type_id->name catalog reverse-map),
relationship/path values, and the TCK `(:Label {..})` render + scenario un-skipping.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
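The value shape and null gating described in the commit message can be modeled in plain Rust. This is only an illustrative sketch: `PropValue`, `NodeValue`, `NodeRow`, and `node_value` are hypothetical names, and the actual lowering builds a DataFusion `named_struct` expression rather than a Rust value.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a typed property value.
#[derive(Debug, Clone, PartialEq)]
pub enum PropValue {
    Utf8(String),
    Int64(i64),
}

/// Illustrative model of the materialized node value:
/// Struct{node_uuid, labels: List<Utf8>, <typed properties>}.
#[derive(Debug, Clone, PartialEq)]
pub struct NodeValue {
    pub node_uuid: String,
    pub labels: Vec<String>,
    pub props: BTreeMap<String, PropValue>,
}

/// One row of columns for a node variable (assumed shape).
/// `node_uuid` is None for an unmatched OPTIONAL MATCH row.
pub struct NodeRow {
    pub node_uuid: Option<String>,
    pub props: BTreeMap<String, PropValue>,
}

/// Models the gating: the whole value is null unless node_uuid is present.
/// A missing label (the unlabelled case) leaves `labels` empty.
pub fn node_value(row: &NodeRow, label: Option<&str>) -> Option<NodeValue> {
    // Early return of None plays the role of null_unless(node_uuid present).
    let node_uuid = row.node_uuid.clone()?;
    Some(NodeValue {
        node_uuid,
        labels: label.map(|l| vec![l.to_string()]).unwrap_or_default(),
        props: row.props.clone(),
    })
}
```

Under these assumptions, a matched `Person` row with `name = "Alice"` yields a value with labels `["Person"]` and a readable `name`, while a row without `node_uuid` yields `None`.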
Walkthrough
Implements whole-node-value materialization for `RETURN n`.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/gf-ir/src/binder.rs (1)
743-751: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Materialize node group keys in aggregate RETURNs too.
`RETURN n, count(*)` enters the aggregate branch and bypasses `lower_return_items(..., true)`, so the non-aggregate `n` is still lowered as a bare `VarRef` instead of `_node_struct`. That preserves the original `col("var_N")` failure mode for terminal aggregate returns.
🐛 Proposed direction
```diff
-        if r.items.iter().any(|i| agg_func_of(&i.expr).is_some()) {
-            self.lower_return_aggregate(&r.items, s);
+        if r.items.iter().any(|i| agg_func_of(&i.expr).is_some()) {
+            self.lower_return_aggregate(&r.items, s, true);
```
Then use the same helper for non-aggregate group keys:
```diff
-    fn lower_return_aggregate(&self, items: &[ReturnItem], s: &mut BinderState) {
+    fn lower_return_aggregate(
+        &self,
+        items: &[ReturnItem],
+        s: &mut BinderState,
+        materialize_nodes: bool,
+    ) {
 ...
-                group_by.push(self.lower_expr(&item.expr, item.span, s));
+                group_by.push(
+                    self.lower_return_item_expr(&item.expr, item.span, s, materialize_nodes),
+                );
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gf-ir/src/binder.rs` around lines 743 - 751, The `lower_return_aggregate` method is not materializing non-aggregate items (group keys) in the same way that `lower_return_items` does when called with the third parameter as true, causing nodes in aggregate RETURN statements like `RETURN n, count(*)` to remain as bare VarRef instead of being materialized as node structs. Update `lower_return_aggregate` to apply the same materialization helper or logic to the non-aggregate items before processing them, ensuring that group keys are properly transformed into node structures just as they would be in non-aggregate returns.
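The aggregate group-key finding above can be illustrated with a toy model. This sketch assumes a simplified `Expr` enum and a `NodeVars` map that are not the real gf-ir types; only the names `lower_return_aggregate` and `materialize_nodes` come from the review, and their signatures here are simplified for illustration.

```rust
use std::collections::HashMap;

/// Toy IR expression (hypothetical; the real gf-ir types are not shown here).
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    VarRef(u32),
    Prop(u32, String),
    NodeStruct { var: u32, label: Option<String> },
    CountStar,
}

/// Hypothetical binder state: node variables mapped to their pattern label.
#[derive(Default)]
pub struct NodeVars(pub HashMap<u32, Option<String>>);

/// Rewrite a bare node variable only when `materialize_nodes` is set
/// (terminal RETURN); everything else passes through unchanged.
pub fn lower_item(e: &Expr, vars: &NodeVars, materialize_nodes: bool) -> Expr {
    match e {
        Expr::VarRef(v) if materialize_nodes => match vars.0.get(v) {
            Some(label) => Expr::NodeStruct { var: *v, label: label.clone() },
            None => e.clone(), // not a node variable: leave the reference alone
        },
        _ => e.clone(),
    }
}

/// Aggregate RETURN: non-aggregate items become group keys and go through
/// the same helper, so `RETURN n, count(*)` also materializes `n`.
pub fn lower_return_aggregate(
    items: &[Expr],
    vars: &NodeVars,
    materialize_nodes: bool,
) -> (Vec<Expr>, Vec<Expr>) {
    let mut group_by = Vec::new();
    let mut aggs = Vec::new();
    for item in items {
        if matches!(item, Expr::CountStar) {
            aggs.push(item.clone());
        } else {
            group_by.push(lower_item(item, vars, materialize_nodes));
        }
    }
    (group_by, aggs)
}
```

Routing group keys through one helper keeps the terminal-RETURN rule in a single place; passing `false` reproduces the WITH behavior, where the variable stays a reference.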
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/gf-rel/src/lowerer.rs`:
- Around line 226-238: The build_node_shapes function only iterates through
top-level ops but fails to collect node shapes from operations nested within
GraphOp::Optional children. This causes node variables introduced inside
OPTIONAL MATCH blocks to have missing shape information. Modify the
build_node_shapes function to also handle the GraphOp::Optional variant by
recursively processing its nested operations in addition to the top-level
GraphOp::NodeScan handling, ensuring that node shape information is collected
regardless of nesting depth within Optional blocks.
- Around line 387-389: The node_shapes cache is only being seeded in the
lower_plan method, but the mixed statement path uses lower_prefix followed by
lower_value_expr, which now depends on having initialized node_shapes. This can
result in empty or stale shapes for statements like MATCH ... SET/CREATE ...
RETURN n. Add the same node_shapes initialization logic (calling
build_node_shapes with the plan's ops) to the lower_prefix method or at the
entry point of the mixed statement path, before lower_value_expr is called, to
ensure node_shapes is properly seeded for that code path as well.
---
Outside diff comments:
In `@crates/gf-ir/src/binder.rs`:
- Around line 743-751: The `lower_return_aggregate` method is not materializing
non-aggregate items (group keys) in the same way that `lower_return_items` does
when called with the third parameter as true, causing nodes in aggregate RETURN
statements like `RETURN n, count(*)` to remain as bare VarRef instead of being
materialized as node structs. Update `lower_return_aggregate` to apply the same
materialization helper or logic to the non-aggregate items before processing
them, ensuring that group keys are properly transformed into node structures
just as they would be in non-aggregate returns.
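The first inline comment (collecting shapes inside `GraphOp::Optional`) amounts to a recursive walk over the plan. The following is a minimal sketch under assumed types: the `GraphOp` variants' fields, `NodeShape::prop_columns`, and the `collect` helper are hypothetical, and only the names `build_node_shapes`, `GraphOp::NodeScan`, `GraphOp::Optional`, `VarId`, and `NodeShape` appear in the review.

```rust
use std::collections::HashMap;

pub type VarId = u32;

/// Hypothetical shape: the persisted property columns of a node variable.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct NodeShape {
    pub prop_columns: Vec<String>,
}

/// Toy plan operators; field layouts are assumptions for illustration.
pub enum GraphOp {
    NodeScan { var: VarId, props: Vec<String> },
    Optional { children: Vec<GraphOp> },
    Other,
}

/// Build the per-plan VarId -> NodeShape map, descending into Optional.
pub fn build_node_shapes(ops: &[GraphOp]) -> HashMap<VarId, NodeShape> {
    let mut shapes = HashMap::new();
    collect(ops, &mut shapes);
    shapes
}

fn collect(ops: &[GraphOp], shapes: &mut HashMap<VarId, NodeShape>) {
    for op in ops {
        match op {
            GraphOp::NodeScan { var, props } => {
                shapes.insert(*var, NodeShape { prop_columns: props.clone() });
            }
            // Recurse so node variables bound under OPTIONAL MATCH are seen
            // at any nesting depth.
            GraphOp::Optional { children } => collect(children, shapes),
            GraphOp::Other => {}
        }
    }
}
```

For the second inline comment, the same `build_node_shapes` call would presumably be made wherever the mixed-statement path begins lowering, so the map is seeded before any value expression is lowered.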
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e283836f-129e-4eea-8b5a-4acd1fcf2af1
📒 Files selected for processing (4)
- crates/gf-api/tests/e2e_baseline.rs
- crates/gf-ir/src/binder.rs
- crates/gf-rel/src/expr.rs
- crates/gf-rel/src/lowerer.rs
Summary
A bare `RETURN n` now materializes a whole node value, `Struct{node_uuid, labels: List<Utf8>, <typed properties>}`, instead of failing on the bare `var_N` table qualifier. This satisfies #785's acceptance ("`RETURN n` returns a node value whose properties are readable downstream") and unblocks #753 (startNode/endNode returning a node) and the TCK whole-node scenarios.
Closes #785. Part of M17 (Phase-1 critical path).
How
- gf-ir/binder: track node vars (+ their pattern label); rewrite a bare node var in a terminal RETURN to `_node_struct(VarRef, "Label")`. WITH / intermediate projections are left as references (materializing there would break downstream `n.age` property access). The label is threaded from the pattern because the lowerer's ontology map is empty in exploratory mode.
- gf-rel: assemble the value at lowering time. `node_value_struct` composes `named_struct{node_uuid, labels, <props>}`, gated `null_unless(node_uuid present)` (so an unmatched OPTIONAL row yields a null node). Property columns come from a per-plan `VarId→NodeShape` map built from the `NodeScan`s, the same columns `join_node_properties` materializes as `var_N.<prop>`.
Validation
- `MATCH (n:Person {name:'Alice'}) RETURN n` → struct with `node_uuid`, labels `["Person"]`, readable `name = "Alice"`.
- The `with_pipeline` IR golden is unchanged (WITH correctly not materialized); `cargo fmt --all -- --check` clean.
Scope / deferred (tracked in #889)
Labeled patterns + terminal RETURN. The unlabelled `MATCH (n) RETURN n` label (needs a runtime `type_id→name` catalog reverse-map), relationship/path values, and the TCK `(:Label {..})` render + scenario un-skipping are #889. Until then, an unlabelled `RETURN n` yields a safe uuid-only struct.
🤖 Generated with Claude Code
Note
Materialize whole node values for bare `RETURN n` in graph queries
- Adds a `_node_struct` IR function that assembles a `{node_uuid, labels, <props…>}` DataFusion struct for a node variable, gated by `null_unless(node_uuid IS NOT NULL)` for OPTIONAL MATCH rows.
- Tracks `node_vars` and rewrites bare node variables in terminal `RETURN` to `_node_struct(var[, label])` calls; `WITH` clauses are explicitly excluded from this rewrite.
- Lowers `_node_struct` by building a `named_struct` with `node_uuid`, a `labels` List, and any persisted property columns looked up from a new `node_shapes` map.
- Walks `NodeScan` operators to build `VarId → NodeShape` (persisted property columns) before expression lowering.
- `RETURN n` now returns a StructArray value instead of a raw node reference; `WITH n` behavior is unchanged.
Macroscope summarized 6cc5ccc.