feat(ir): startNode(r)/endNode(r) return the endpoint node value (#753)#892
Conversation
`startNode(r)` / `endNode(r)` over a matched fixed-hop relationship now return the start/end node with identity, labels, and readable properties (reusing #785's node-value materialization), not a bare UUID. The binder records each fixed hop's `(src, dst)` node vars at `Expand` build (`edge_vars`) and rewrites the calls onto them: - general/property-access context (`startNode(r).name`) -> a node VarRef, so the access resolves against the endpoint node's columns; - terminal RETURN (`RETURN endNode(r)`) -> a whole node value via the shared `_node_struct` helper. Variable-length edge vars bind to a relationship *list*, not one relationship, so they are deliberately not recorded; any unrecognized argument falls through to the usual `UnknownFunction` error rather than silently returning a UUID (which would fail TCK). Closes the #743 deferral. Validated end-to-end: `start_end_node_return_whole_node` and `start_end_node_property_access` in gf-api e2e_baseline (73 pass); binder + lowerer goldens unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
WalkthroughThe binder gains an ChangesstartNode/endNode Endpoint Materialization
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related issues
Possibly related PRs
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
crates/gf-api/tests/e2e_baseline.rs (1)
146-154: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winAlign test scope with its name (
start_end_node_return_whole_node).This test currently validates only
startNode(r)as a whole node. Either rename it to reflect start-only coverage or assertendNode(r)whole-node materialization in the same test to match the stated scope.Suggested direction
- "MATCH (a:Person {name:'Alice'})-[r:KNOWS]->(b:Person {name:'Bob'}) RETURN startNode(r)", + "MATCH (a:Person {name:'Alice'})-[r:KNOWS]->(b:Person {name:'Bob'}) \ + RETURN startNode(r) AS s, endNode(r) AS e",Then downcast/assert both
sandeasStructArrayand verify readablenamevalues (Alice/Bob).🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gf-api/tests/e2e_baseline.rs` around lines 146 - 154, The test function start_end_node_return_whole_node currently only validates startNode(r) but the test name indicates it should validate both startNode and endNode. Expand the test to include a query for endNode(r) in addition to the existing startNode(r) query, then assert that both results are properly materialized as whole nodes by downcasting them as StructArray and verifying the readable name properties (Alice for startNode, Bob for endNode) to ensure the endpoint nodes are fully materialized with their identity, labels, and properties.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/gf-ir/src/binder.rs`:
- Around line 333-340: The edge_vars insertion doesn't account for relationship
direction, causing startNode(r) and endNode(r) to resolve to incorrect endpoints
for reversed or undirected edges. When recording the edge variable relationship
in the edge_vars map (in the !is_var_hop block around line 333-340, and also at
lines 1208-1218), check whether the edge traversal is reversed (incoming
direction) and swap the src_var and dst_var values accordingly before insertion.
Additionally, for undirected edges, either skip the insertion entirely or add
special handling to mark them as requiring runtime direction resolution, rather
than using a static left/right assignment.
- Around line 244-246: The propagation of edge_vars from OPTIONAL MATCH in the
loop iterating over sub_state.edge_vars does not preserve null-gating semantics
required by Cypher. When an optional match edge is unmatched (null), accessing
properties through endpoint resolution like startNode(r) should return null, but
currently allows reading from the endpoint node regardless. Either carry the
null-gate information through endpoint resolution and gate endpoint property
access and materialization on the edge variable being non-null, or conditionally
skip this edge_vars propagation when processing OPTIONAL MATCH edges until the
null-gate mechanism is in place. Apply the same fix to the other occurrences at
lines 857-872, 1082-1088, and 1216-1238.
---
Nitpick comments:
In `@crates/gf-api/tests/e2e_baseline.rs`:
- Around line 146-154: The test function start_end_node_return_whole_node
currently only validates startNode(r) but the test name indicates it should
validate both startNode and endNode. Expand the test to include a query for
endNode(r) in addition to the existing startNode(r) query, then assert that both
results are properly materialized as whole nodes by downcasting them as
StructArray and verifying the readable name properties (Alice for startNode, Bob
for endNode) to ensure the endpoint nodes are fully materialized with their
identity, labels, and properties.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 8d422a13-a12e-4acb-b361-961a15c9f8e8
⛔ Files ignored due to path filters (1)
CHANGELOG.mdis excluded by!**/*.md
📒 Files selected for processing (2)
crates/gf-api/tests/e2e_baseline.rscrates/gf-ir/src/binder.rs
…753) The shared helper never reads `self` (it works through the `BinderState` builder), so `cargo clippy --workspace -- -D warnings` rejected the `&self` receiver. Make it an associated function and call it as `Self::node_struct_expr`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…eRabbit) Two correctness fixes from review: - Direction: `src_var`/`dst_var` are the pattern's left/right vars, but the relationship's true start/end follow its direction. Record `(src, dst)` for outgoing, swap to `(dst, src)` for incoming, and skip undirected edges (their orientation is per matched row, so a static rewrite could pick the wrong endpoint). Validated: `(Bob)<-[r:KNOWS]-(b)` -> startNode = Alice (source). - OPTIONAL MATCH: do not propagate `edge_vars` out of an optional sub-state. After an unmatched optional, `r` is null and `startNode(r)` must be null, but the rewrite would resolve to the outer non-null endpoint var. Without an edge-uuid null gate (#889) that yields a wrong value, so optional edges stay unresolved (startNode/endNode -> UnknownFunction). Edges still resolve inside the optional's own WHERE, where matched rows have a non-null `r`. New e2e: `start_end_node_respects_incoming_direction`, `start_node_over_optional_edge_is_deferred`. clippy --workspace -D warnings clean; gf-ir goldens (92) and e2e (75) green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/gf-api/tests/e2e_baseline.rs`:
- Around line 271-277: The test at the execute call in the e2e_baseline.rs test
is not actually exercising the null optional branch because Alice always has
KNOWS edges in the test fixture, so the optional match will always succeed with
r being non-null. Modify the MATCH query to use a person or relationship pattern
that actually yields a null result from the OPTIONAL MATCH clause. For example,
change the query to match a person who has no outgoing edges of the specified
type, or use a relationship type that doesn't exist in the fixture, so that the
optional match returns null and the startNode(r) call is forced to handle the
null case as intended by the test assertion.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 1bf6a397-5d71-4c53-b731-7925f421a39b
⛔ Files ignored due to path filters (1)
CHANGELOG.mdis excluded by!**/*.md
📒 Files selected for processing (2)
crates/gf-api/tests/e2e_baseline.rscrates/gf-ir/src/binder.rs
…CodeRabbit) Switch the OPTIONAL-MATCH deferral test from Alice (who has KNOWS edges) to Eve (no outgoing edges), so the represented scenario is a genuinely unmatched optional. The deferral itself surfaces as a bind-time UnknownFunction (data-independent), but the data now matches the intent. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
startNode(r)/endNode(r)over a matched fixed-hop relationship now return the endpoint node value — identity, labels, and readable properties — reusing the #785 node-value materialization, instead of a bare UUID (or the priorUnknownFunctionerror). Closes the #743 deferral.Approach
A binder-only rewrite (no new lowering or exec):
Expandbuild records each fixed hop's(src, dst)node vars in a newBinderState.edge_varsmap (threaded through OPTIONAL-MATCH sub-states likenode_vars/path_vars).resolve_endpoint_nodemapsstartNode(r)/endNode(r)over a recorded edge var to the src/dst node var.startNode(r).name) lowers to a nodeVarRef, so the access resolves against the endpoint node's columns.RETURN endNode(r)) upgrades to a whole node value via a sharednode_struct_exprhelper (the same_node_structpath as a bareRETURN n).Deliberately scoped out: variable-length edge vars bind to a relationship list, not a single relationship, so they are not recorded —
startNode/endNodeover a*edge (and any other unrecognized argument) falls through to the usualUnknownFunctionerror rather than silently returning a UUID (which would fail TCK).Validation
crates/gf-api/tests/e2e_baseline.rs):start_end_node_return_whole_node(whole-node return, labels + props readable) andstart_end_node_property_access(startNode(r).name/endNode(r).name). 73 pass.cargo fmt --all -- --checkclean.startNode/endNodefeature scenarios remain@skip-rustuntil the read-path tier (tck: OPTIONAL MATCH, variable-length paths, and UNWIND on Rust core #600) un-skips them; this PR is the engine capability they depend on.Closes #753.
🤖 Generated with Claude Code
Note
Make
startNode(r)/endNode(r)return full node values for fixed-hop directed relationshipsstartNode(r)andendNode(r)now return a whole node value (identity, labels, properties) instead of a UUID when the relationship is a fixed-hop directed edge in a non-optional MATCH.edge_varsmap onBinderState, populated duringlower_matchforDirection::OutandDirection::Inedges.startNode(r).namenow binds correctly to the endpoint node variable, enabling column resolution at query time.startNode/endNoderemain unresolved (surfacing asUnknownFunction) for OPTIONAL MATCH edges, undirected relationships, and variable-length hops — these cases are explicitly deferred.Macroscope summarized 48d1e95.
Summary by CodeRabbit
New Features
startNode(r)andendNode(r)now materialize as full node structures (labels + readable properties) so you can access fields likestartNode(r).name/endNode(r).name.Bug Fixes
OPTIONAL MATCHrelationships, endpoint evaluation now fails when the relationship isnull, preventing misleading endpoint values.Tests
RETURN startNode(r)materialization, endpoint property access, direction handling, andOPTIONAL MATCHnull behavior.