feat(ir): lower and execute read-path WITH (#814 slice 1)#894
Conversation
A `MATCH … WITH … RETURN` pipeline now projects/renames mid-query and chains; before, `GraphOp::With` hit "operator not yet lowered". Binder (lower_with): - lower each WITH item's expression in the current scope, mint a fresh `out_var` for its alias, then RESET the scope to exactly those aliases (Cypher semantics) — a variable not carried forward is out of scope, so referencing it errors rather than returning a wrong result; - a WITH WHERE / ORDER BY is lowered against the new scope. - Guards (reject, don't mishandle): carrying a whole node/path through WITH (`WITH n` — a node spans multiple columns) and aggregation in WITH. IR: `ProjectItem.out_var` (serde-default) links a WITH alias to the var it introduces, so the lowerer can map it. `None` for terminal RETURN. Lowerer (lower_with_op): project each item to its alias column, register `out_var -> column` in the var-map, then filter on the WITH WHERE over the projected columns. Deferred to later #814 slices: whole-node/path pass-through, aggregation in WITH, write-result RETURN, and mid-statement reads-after-writes. Validated: new e2e (project/rename, WHERE post-projection, scope-reset errors, node-passthrough deferral) — 80 pass; IR goldens updated for `out_var` + the `with_pipeline` golden re-pointed to a scalar WITH; gf-rel/gf-exec/gf-cli green; clippy --workspace -D warnings clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Walkthrough
ChangesWITH Scope Reset and Alias Propagation
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related issues
Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (3)
crates/gf-api/tests/e2e_baseline.rs (2)
347-365: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick winStrengthen WITH happy-path assertions beyond row counts.
These two tests currently validate only cardinality, so alias/projection regressions can slip through if row counts still match. Assert returned values (and alias column) in addition to
rows_produced.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gf-api/tests/e2e_baseline.rs` around lines 347 - 365, The test functions `with_projection_simple` and `with_where_filters_post_projection` currently only validate row cardinality through `rows_produced` assertions, which allows regressions in actual data values and column aliases to go undetected. Strengthen both tests by adding assertions that verify the actual returned values in the result set in addition to the row count check. For the first test, assert that all five names are correctly projected in the nm column, and for the second test, assert that the returned value is specifically 'Alice'. This ensures alias projections and filtered data correctness, not just cardinality.
374-391: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick winAssert the expected error reason, not just
is_err().Both negative tests pass on any error type. Please assert the error message/kind reflects the intended contract (
bout of scope afterWITH, and whole-nodeWITH nunsupported) to avoid false positives.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gf-api/tests/e2e_baseline.rs` around lines 374 - 391, The negative test assertions in with_node_out_of_scope and with_node_passthrough_is_deferred functions only verify that an error occurred with is_err(), but do not validate that the error contains the expected reason or message. To fix this, inspect the actual error result by extracting the error details and asserting that the error message contains the specific expected text (e.g., that b is out of scope in the first test, and that whole-node WITH is unsupported in the second test). Replace the generic is_err() checks with assertions that validate the error kind or message matches the intended contract to prevent false positives.crates/gf-ir/src/plan.rs (1)
688-688: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winExercise the non-
Noneout_varround-trip forWITH.This
GraphOp::Withcase is the new IR contract that should preserveout_varfrom binder to lowerer; keeping itNoneonly covers the terminal-RETURNshape.Test-contract tweak
- out_var: None, + out_var: Some(v0),🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gf-ir/src/plan.rs` at line 688, The `GraphOp::With` case at the `out_var: None` assignment is only covering the terminal-RETURN shape of the IR contract. You need to add or modify test coverage to exercise the non-None `out_var` round-trip scenario for the WITH operation, ensuring that the `out_var` is preserved from the binder to the lowerer when it has an actual value rather than None. This will validate that the new IR contract properly handles cases where `out_var` is not None.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/gf-ir/src/binder.rs`:
- Around line 759-765: The aggregate function check in the WITH clause
validation using agg_func_of(&item.expr) only detects top-level aggregate calls
and misses nested aggregates like count(n) + 1. Replace the current shallow
check with a recursive function that traverses the entire expression tree to
find aggregate functions at any depth, ensuring all aggregate usage patterns in
WITH clauses are properly rejected before reaching the lowerer.
In `@crates/gf-rel/src/lowerer.rs`:
- Around line 751-755: The VarMap is not being reset when installing the WITH
scope, which leaves pre-WITH variables resolvable in later lowering paths.
Before the loop that iterates through items and inserts aliases into var_map,
clear the var_map to remove all pre-WITH variable bindings. This ensures that
only the WITH clause aliases (from item.out_var and item.alias) remain in scope,
properly resetting the scope boundary as required by the binder contract.
---
Nitpick comments:
In `@crates/gf-api/tests/e2e_baseline.rs`:
- Around line 347-365: The test functions `with_projection_simple` and
`with_where_filters_post_projection` currently only validate row cardinality
through `rows_produced` assertions, which allows regressions in actual data
values and column aliases to go undetected. Strengthen both tests by adding
assertions that verify the actual returned values in the result set in addition
to the row count check. For the first test, assert that all five names are
correctly projected in the nm column, and for the second test, assert that the
returned value is specifically 'Alice'. This ensures alias projections and
filtered data correctness, not just cardinality.
- Around line 374-391: The negative test assertions in with_node_out_of_scope
and with_node_passthrough_is_deferred functions only verify that an error
occurred with is_err(), but do not validate that the error contains the expected
reason or message. To fix this, inspect the actual error result by extracting
the error details and asserting that the error message contains the specific
expected text (e.g., that b is out of scope in the first test, and that
whole-node WITH is unsupported in the second test). Replace the generic is_err()
checks with assertions that validate the error kind or message matches the
intended contract to prevent false positives.
In `@crates/gf-ir/src/plan.rs`:
- Line 688: The `GraphOp::With` case at the `out_var: None` assignment is only
covering the terminal-RETURN shape of the IR contract. You need to add or modify
test coverage to exercise the non-None `out_var` round-trip scenario for the
WITH operation, ensuring that the `out_var` is preserved from the binder to the
lowerer when it has an actual value rather than None. This will validate that
the new IR contract properly handles cases where `out_var` is not None.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 73cda8e8-a71b-4b21-82e0-7b6dcb88e085
⛔ Files ignored due to path filters (13)
CHANGELOG.mdis excluded by!**/*.mdcrates/gf-ir/tests/ir_goldens/golden__filtered_scan.snapis excluded by!**/*.snapcrates/gf-ir/tests/ir_goldens/golden__named_path_fixed_segment_and_return_p.snapis excluded by!**/*.snapcrates/gf-ir/tests/ir_goldens/golden__named_path_variable_functions.snapis excluded by!**/*.snapcrates/gf-ir/tests/ir_goldens/golden__one_hop_expand.snapis excluded by!**/*.snapcrates/gf-ir/tests/ir_goldens/golden__optional_match.snapis excluded by!**/*.snapcrates/gf-ir/tests/ir_goldens/golden__order_by_limit.snapis excluded by!**/*.snapcrates/gf-ir/tests/ir_goldens/golden__parameter.snapis excluded by!**/*.snapcrates/gf-ir/tests/ir_goldens/golden__simple_node_scan.snapis excluded by!**/*.snapcrates/gf-ir/tests/ir_goldens/golden__two_hop_expand.snapis excluded by!**/*.snapcrates/gf-ir/tests/ir_goldens/golden__unwind.snapis excluded by!**/*.snapcrates/gf-ir/tests/ir_goldens/golden__variable_length_expand.snapis excluded by!**/*.snapcrates/gf-ir/tests/ir_goldens/golden__with_pipeline.snapis excluded by!**/*.snap
📒 Files selected for processing (6)
crates/gf-api/tests/e2e_baseline.rscrates/gf-ir/src/binder.rscrates/gf-ir/src/lib.rscrates/gf-ir/src/plan.rscrates/gf-ir/tests/golden.rscrates/gf-rel/src/lowerer.rs
…CodeRabbit) Two review fixes: - Recursive aggregate rejection: `expr_contains_aggregate` walks compound expressions (BinaryOp/UnaryOp/Parenthesized/FunctionCall args/Property/ List), so `WITH count(n) + 1 AS c` is rejected at bind time, not only a bare top-level `WITH count(n) AS c`. - Var-map scope reset: `lower_with_op` now clears the VarMap before installing the WITH aliases, so a dropped pre-WITH VarId can't resolve through a later lowering path — mirroring the binder's scope reset. Added `VarMap::clear`. New e2e `with_nested_aggregate_is_rejected`; all WITH e2e + IR goldens green; clippy --workspace -D warnings clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Part of #814 (slice 1 of 3 — the read path; does not close the issue).
Summary
A
MATCH … WITH … RETURNpipeline now projects/renames mid-query and chains. Before,GraphOp::Withhit "operator not yet lowered (deferred to #577+)" — which blocked a large share of corpus scenarios (it was the top failure when probing whole-node un-skips). This is the highest-leverage engine gap from that pivot.How
Binder (
lower_with) — WITH introduces a new scope:out_varfor its alias, then reset the scope to exactly those aliases (Cypher semantics);with_resets_scope_dropping_unprojected_varspins this);WITH … WHERE/ORDER BYlower against the new scope.IR —
ProjectItem.out_var(serde-default) links a WITH alias to the variable it introduces, so the lowerer can map it.Nonefor terminalRETURNitems.Lowerer (
lower_with_op) — project each item to its alias column, registerout_var → columnin the var-map (so downstream clauses resolve), then filter on theWITH WHEREover the projected columns.Correct-or-error (deferred, rejected — never silently wrong)
WITH(WITH n— a node spans multiple columns).WITH(WITH count(*) …).RETURNand mid-statement reads-after-writes — rust: reads after writes in one statement — WITH lowering and write-result RETURN #814's later slices.Validation
WHEREpost-projection, scope-reset error, node-passthrough deferral — 80 pass.out_var(mechanical) +with_pipelinere-pointed to a supported scalar WITH; gf-rel / gf-exec / gf-cli green;cargo clippy --workspace -- -D warningsclean; fmt clean.🤖 Generated with Claude Code
Note
Implement read-path WITH clause lowering and execution in the IR and relational layers
WITHclause now projects and renames mid-pipeline variables, resets scope to only the projected aliases, and supports downstreamWHERE/ORDER BY/SKIP/LIMIT/DISTINCT.lower_withrejects whole-node/path passthrough and any aggregation insideWITHitems with clear errors; valid items get a freshout_varallocated and the scope maps are rebuilt from the new aliases only.out_varfield onProjectItemcarries theVarId-to-alias mapping forWITHitems so the relational lowerer can register columns intoVarMap.lower_with_opprojects expressions to columns, clearsVarMap, re-registers aliases, and applies the optionalWHEREpredicate over the new scope.WITHfiltering, scope reset, and the two deferred/rejected cases (node passthrough, aggregation).Macroscope summarized 231809a.
Summary by CodeRabbit
WITHclause scoping so only explicitly projected variables are available to subsequent clauses.WITH+WHEREbehavior so filters use the projected aliases as expected.WITHusages and aggregation-in-WITHcases.WITHprojection rules for whole-node/path passthrough and required proper aliasing.WITHsemantics and updated plan/golden tests to match the new projection behavior.