feat(ir): lower and execute read-path WITH (#814 slice 1) #894

Merged
DecisionNerd merged 2 commits into main from feature/814-with-lowering on Jun 23, 2026
@DecisionNerd DecisionNerd commented Jun 23, 2026

Owner

Part of #814 (slice 1 of 3 — the read path; does not close the issue).

Summary

A MATCH … WITH … RETURN pipeline now projects/renames mid-query and chains. Before, GraphOp::With hit "operator not yet lowered (deferred to #577+)" — which blocked a large share of corpus scenarios (it was the top failure when probing whole-node un-skips). This is the highest-leverage engine gap from that pivot.

How

Binder (lower_with) — WITH introduces a new scope:

  • lower each item's expression in the current scope (it references upstream vars), mint a fresh out_var for its alias, then reset the scope to exactly those aliases (Cypher semantics);
  • a variable not carried forward is out of scope afterwards — referencing it is an error, not a silently-wrong result (with_resets_scope_dropping_unprojected_vars pins this);
  • WITH … WHERE / ORDER BY lower against the new scope.
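The scope reset described above can be sketched roughly as follows. This is a toy model, not the binder's actual code: `Scope`, `WithItem`, `declare`, and `resolve` are hypothetical names, and each WITH item is reduced to a single `var.prop AS alias` shape. Only the ordering mirrors the description: resolve against the upstream scope, mint a fresh out_var per alias, then replace the scope with the aliases.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the binder's variable id.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct VarId(u32);

/// A WITH item reduced to `var.prop AS alias` (illustrative only; real
/// items carry full expressions).
struct WithItem<'a> {
    var: &'a str,
    prop: &'a str,
    alias: &'a str,
}

/// Hypothetical scope: the names visible to the clause being lowered.
#[derive(Default)]
struct Scope {
    vars: HashMap<String, VarId>,
    next_id: u32,
}

impl Scope {
    /// Allocate a fresh id without binding it to a name.
    fn fresh(&mut self) -> VarId {
        let id = VarId(self.next_id);
        self.next_id += 1;
        id
    }

    /// Bind `name` to a fresh id (what MATCH would do for a pattern variable).
    fn declare(&mut self, name: &str) -> VarId {
        let id = self.fresh();
        self.vars.insert(name.to_string(), id);
        id
    }

    /// Look a name up; a name that was not carried forward is an error.
    fn resolve(&self, name: &str) -> Result<VarId, String> {
        self.vars
            .get(name)
            .copied()
            .ok_or_else(|| format!("variable `{}` is not in scope", name))
    }

    /// Lower WITH items: resolve each against the upstream scope, mint a
    /// fresh out_var per alias, then reset the scope to exactly the aliases.
    /// Returns (upstream var, property, out_var) triples.
    fn lower_with(
        &mut self,
        items: &[WithItem<'_>],
    ) -> Result<Vec<(VarId, String, VarId)>, String> {
        let mut lowered = Vec::with_capacity(items.len());
        let mut new_vars = HashMap::new();
        for item in items {
            let source = self.resolve(item.var)?; // still the upstream scope
            let out_var = self.fresh();
            new_vars.insert(item.alias.to_string(), out_var);
            lowered.push((source, item.prop.to_string(), out_var));
        }
        self.vars = new_vars; // scope reset: only the aliases survive
        Ok(lowered)
    }
}
```

In this sketch, after lowering `WITH a.name AS nm` on a scope that held `a` and `b`, `resolve("nm")` succeeds while `resolve("a")` and `resolve("b")` both return an error, which is the behaviour the scope-reset test is meant to pin.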

IR: ProjectItem.out_var (serde-default) links a WITH alias to the variable it introduces, so the lowerer can map it. It is None for terminal RETURN items.

Lowerer (lower_with_op) — project each item to its alias column, register out_var → column in the var-map (so downstream clauses resolve), then filter on the WITH WHERE over the projected columns.
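A minimal sketch of that lowering step, under loud assumptions: rows are plain `Vec<String>`, each item just copies one input column, and the `VarMap`, `ProjectItem`, and `lower_with_op` shapes here are simplified stand-ins rather than the crate's real signatures. The sketch also clears the var-map before registering aliases, so pre-WITH variables stop resolving.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct VarId(u32);

/// Hypothetical var-map: which column holds each bound variable.
#[derive(Default)]
struct VarMap {
    columns: HashMap<VarId, usize>,
}

impl VarMap {
    fn insert(&mut self, var: VarId, column: usize) {
        self.columns.insert(var, column);
    }
    fn get(&self, var: VarId) -> Option<usize> {
        self.columns.get(&var).copied()
    }
    fn clear(&mut self) {
        self.columns.clear();
    }
}

/// Simplified WITH item: read the column bound to `input`, expose it
/// downstream as `out_var`. Real items carry an arbitrary expression.
struct ProjectItem {
    input: VarId,
    out_var: VarId,
}

type Row = Vec<String>;

fn lower_with_op(
    rows: &[Row],
    items: &[ProjectItem],
    var_map: &mut VarMap,
    where_predicate: Option<&dyn Fn(&Row) -> bool>,
) -> Result<Vec<Row>, String> {
    // 1. Resolve every item's input column against the pre-WITH var-map.
    let mut input_columns = Vec::with_capacity(items.len());
    for item in items {
        let column = var_map
            .get(item.input)
            .ok_or_else(|| format!("unbound variable {:?}", item.input))?;
        input_columns.push(column);
    }

    // 2. Reset the var-map, then register out_var -> projected column.
    var_map.clear();
    for (column, item) in items.iter().enumerate() {
        var_map.insert(item.out_var, column);
    }

    // 3. Project each row, then apply the WITH WHERE over the projected columns.
    let projected = rows
        .iter()
        .map(|row| input_columns.iter().map(|&c| row[c].clone()).collect::<Row>())
        .filter(|row| where_predicate.map_or(true, |pred| pred(row)))
        .collect();
    Ok(projected)
}
```

Resolving all input columns before touching the var-map keeps a failed lookup from leaving the map half-reset, which seemed the safer ordering for a sketch.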

Correct-or-error (deferred, rejected — never silently wrong)

Validation

  • New e2e: project/rename, WHERE post-projection, scope-reset error, node-passthrough deferral — 80 pass.
  • IR goldens updated for out_var (mechanical) + with_pipeline re-pointed to a supported scalar WITH; gf-rel / gf-exec / gf-cli green; cargo clippy --workspace -- -D warnings clean; fmt clean.

🤖 Generated with Claude Code

Note

Implement read-path WITH clause lowering and execution in the IR and relational layers

  • The WITH clause now projects and renames mid-pipeline variables, resets scope to only the projected aliases, and supports downstream WHERE/ORDER BY/SKIP/LIMIT/DISTINCT.
  • In binder.rs, lower_with rejects whole-node/path passthrough and any aggregation inside WITH items with clear errors; valid items get a fresh out_var allocated and the scope maps are rebuilt from the new aliases only.
  • A new out_var field on ProjectItem carries the VarId-to-alias mapping for WITH items so the relational lowerer can register columns into VarMap.
  • In lowerer.rs, lower_with_op projects expressions to columns, clears VarMap, re-registers aliases, and applies the optional WHERE predicate over the new scope.
  • E2E tests cover alias projection, post-WITH filtering, scope reset, and the two deferred/rejected cases (node passthrough, aggregation).

Macroscope summarized 231809a.

Summary by CodeRabbit

  • Bug Fixes
    • Fixed WITH clause scoping so only explicitly projected variables are available to subsequent clauses.
    • Improved WITH + WHERE behavior so filters use the projected aliases as expected.
    • Added stricter validation for unsupported WITH usages and aggregation-in-WITH cases.
  • New Features
    • Enforced WITH projection rules for whole-node/path passthrough and required proper aliasing.
  • Tests
    • Added end-to-end coverage for WITH semantics and updated plan/golden tests to match the new projection behavior.

A `MATCH … WITH … RETURN` pipeline now projects/renames mid-query and
chains; before, `GraphOp::With` hit "operator not yet lowered".

Binder (lower_with):
- lower each WITH item's expression in the current scope, mint a fresh
  `out_var` for its alias, then RESET the scope to exactly those aliases
  (Cypher semantics) — a variable not carried forward is out of scope, so
  referencing it errors rather than returning a wrong result;
- a WITH WHERE / ORDER BY is lowered against the new scope.
- Guards (reject, don't mishandle): carrying a whole node/path through
  WITH (`WITH n` — a node spans multiple columns) and aggregation in WITH.

IR: `ProjectItem.out_var` (serde-default) links a WITH alias to the var
it introduces, so the lowerer can map it. `None` for terminal RETURN.

Lowerer (lower_with_op): project each item to its alias column, register
`out_var -> column` in the var-map, then filter on the WITH WHERE over
the projected columns.

Deferred to later #814 slices: whole-node/path pass-through, aggregation
in WITH, write-result RETURN, and mid-statement reads-after-writes.

Validated: new e2e (project/rename, WHERE post-projection, scope-reset
errors, node-passthrough deferral) — 80 pass; IR goldens updated for
`out_var` + the `with_pipeline` golden re-pointed to a scalar WITH;
gf-rel/gf-exec/gf-cli green; clippy --workspace -D warnings clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Contributor

Review Change Stack

Walkthrough

ProjectItem in gf-ir gains an out_var: Option<VarId> field (defaulting to None for terminal RETURN items). The binder's lower_with is rewritten to validate WITH items, mint fresh out_var bindings, and reset the visible scope to only the projected aliases. A new lower_with_op helper in the relational lowerer consumes GraphOp::With, maps aliases into VarMap, and applies optional WHERE predicates. Tests are updated throughout.

Changes

WITH Scope Reset and Alias Propagation

| Layer | File(s) | Summary |
| --- | --- | --- |
| ProjectItem out_var field and test fixups | crates/gf-ir/src/lib.rs, crates/gf-ir/src/plan.rs, crates/gf-rel/src/lowerer.rs | ProjectItem gains out_var: Option<VarId> annotated with #[serde(default)]; all test construction sites in plan round-trip tests and relational lowerer unit tests are updated to include out_var: None. |
| Binder lower_with rewrite and golden test | crates/gf-ir/src/binder.rs, crates/gf-ir/tests/golden.rs | lower_with validates WITH items (rejecting aggregation anywhere, whole-node, and path pass-through), enforces aliasing for non-variable expressions, lowers items in the upstream scope, assigns a fresh out_var per alias, clears all variable tracking, and repopulates only the WITH aliases before lowering WHERE/ORDER BY/SKIP/LIMIT. A new expr_contains_aggregate helper detects aggregates recursively. lower_return_items explicitly sets out_var: None. The with_pipeline golden test is updated to a property-alias query (WITH n.name AS nm). |
| Relational lowerer lower_with_op | crates/gf-rel/src/expr.rs, crates/gf-rel/src/lowerer.rs | VarMap gains a clear() method for scope reset. lower_op_with_arena dispatches GraphOp::With to a new lower_with_op helper that projects each item's expression using the current VarMap, clears and repopulates VarMap with each out_var alias, and applies an optional where_predicate filter after projection. |
| E2E tests for WITH semantics | crates/gf-api/tests/e2e_baseline.rs | Five new tests cover: alias passthrough into RETURN, WHERE filtering on a projected alias, scope reset rejecting unprojected variables, whole-node passthrough rejection at execution time, and nested aggregate expression rejection. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

Possibly related PRs

  • DecisionNerd/graphforge#687: Both PRs touch crates/gf-ir/src/binder.rs lowering layers; this PR extends binder work with the lower_with rewrite and alias projection logic.
  • DecisionNerd/graphforge#689: Added the with_pipeline snapshot harness in crates/gf-ir/tests/golden.rs that this PR updates to match the new WITH scoping semantics.
  • DecisionNerd/graphforge#890: Both PRs modify crates/gf-ir/src/binder.rs around lower_return_items and terminal item lowering; this PR's out_var: None terminal marking intersects with that PR's return-lowering changes.

Suggested labels

tests

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'feat(ir): lower and execute read-path WITH (#814 slice 1)' clearly summarizes the main change, implementing WITH clause lowering and execution, and references the tracked issue correctly. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | PR description is comprehensive and well-structured, covering objectives, implementation details, and validation. It clearly maps to most template sections with good technical depth. |


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (3)
crates/gf-api/tests/e2e_baseline.rs (2)

347-365: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win

Strengthen WITH happy-path assertions beyond row counts.

These two tests currently validate only cardinality, so alias/projection regressions can slip through if row counts still match. Assert returned values (and alias column) in addition to rows_produced.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/gf-api/tests/e2e_baseline.rs` around lines 347 - 365, The test
functions `with_projection_simple` and `with_where_filters_post_projection`
currently only validate row cardinality through `rows_produced` assertions,
which allows regressions in actual data values and column aliases to go
undetected. Strengthen both tests by adding assertions that verify the actual
returned values in the result set in addition to the row count check. For the
first test, assert that all five names are correctly projected in the nm column,
and for the second test, assert that the returned value is specifically 'Alice'.
This ensures alias projections and filtered data correctness, not just
cardinality.

374-391: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win

Assert the expected error reason, not just is_err().

Both negative tests pass on any error type. Please assert the error message/kind reflects the intended contract (b out of scope after WITH, and whole-node WITH n unsupported) to avoid false positives.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/gf-api/tests/e2e_baseline.rs` around lines 374 - 391, The negative
test assertions in with_node_out_of_scope and with_node_passthrough_is_deferred
functions only verify that an error occurred with is_err(), but do not validate
that the error contains the expected reason or message. To fix this, inspect the
actual error result by extracting the error details and asserting that the error
message contains the specific expected text (e.g., that b is out of scope in the
first test, and that whole-node WITH is unsupported in the second test). Replace
the generic is_err() checks with assertions that validate the error kind or
message matches the intended contract to prevent false positives.
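One way to tighten such negative tests, sketched here with a hypothetical helper (`assert_err_contains` is not part of the codebase, and the error text used with it is an assumption): check that the error's rendered message mentions the expected reason instead of accepting any `Err`.

```rust
use std::fmt::{Debug, Display};

/// Hypothetical test helper: panics unless `result` is an `Err` whose
/// rendered message contains `needle`.
fn assert_err_contains<T: Debug, E: Display>(result: Result<T, E>, needle: &str) {
    match result {
        Ok(value) => panic!(
            "expected an error containing {:?}, got Ok({:?})",
            needle, value
        ),
        Err(err) => {
            let message = err.to_string();
            assert!(
                message.contains(needle),
                "error {:?} does not mention {:?}",
                message,
                needle
            );
        }
    }
}
```

A test would then pass the query result and a stable fragment of the expected message. Matching on an error kind, if the engine exposes one, would be less brittle than matching on text.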
crates/gf-ir/src/plan.rs (1)

688-688: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Exercise the non-None out_var round-trip for WITH.

This GraphOp::With case is the new IR contract that should preserve out_var from binder to lowerer; keeping it None only covers the terminal-RETURN shape.

Test-contract tweak
-                    out_var: None,
+                    out_var: Some(v0),
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/gf-ir/src/plan.rs` at line 688, The `GraphOp::With` case at the
`out_var: None` assignment is only covering the terminal-RETURN shape of the IR
contract. You need to add or modify test coverage to exercise the non-None
`out_var` round-trip scenario for the WITH operation, ensuring that the
`out_var` is preserved from the binder to the lowerer when it has an actual
value rather than None. This will validate that the new IR contract properly
handles cases where `out_var` is not None.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/gf-ir/src/binder.rs`:
- Around line 759-765: The aggregate function check in the WITH clause
validation using agg_func_of(&item.expr) only detects top-level aggregate calls
and misses nested aggregates like count(n) + 1. Replace the current shallow
check with a recursive function that traverses the entire expression tree to
find aggregate functions at any depth, ensuring all aggregate usage patterns in
WITH clauses are properly rejected before reaching the lowerer.

In `@crates/gf-rel/src/lowerer.rs`:
- Around line 751-755: The VarMap is not being reset when installing the WITH
scope, which leaves pre-WITH variables resolvable in later lowering paths.
Before the loop that iterates through items and inserts aliases into var_map,
clear the var_map to remove all pre-WITH variable bindings. This ensures that
only the WITH clause aliases (from item.out_var and item.alias) remain in scope,
properly resetting the scope boundary as required by the binder contract.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 73cda8e8-a71b-4b21-82e0-7b6dcb88e085

📥 Commits

Reviewing files that changed from the base of the PR and between 536eae4 and db1f0b5.

⛔ Files ignored due to path filters (13)
  • CHANGELOG.md is excluded by !**/*.md
  • crates/gf-ir/tests/ir_goldens/golden__filtered_scan.snap is excluded by !**/*.snap
  • crates/gf-ir/tests/ir_goldens/golden__named_path_fixed_segment_and_return_p.snap is excluded by !**/*.snap
  • crates/gf-ir/tests/ir_goldens/golden__named_path_variable_functions.snap is excluded by !**/*.snap
  • crates/gf-ir/tests/ir_goldens/golden__one_hop_expand.snap is excluded by !**/*.snap
  • crates/gf-ir/tests/ir_goldens/golden__optional_match.snap is excluded by !**/*.snap
  • crates/gf-ir/tests/ir_goldens/golden__order_by_limit.snap is excluded by !**/*.snap
  • crates/gf-ir/tests/ir_goldens/golden__parameter.snap is excluded by !**/*.snap
  • crates/gf-ir/tests/ir_goldens/golden__simple_node_scan.snap is excluded by !**/*.snap
  • crates/gf-ir/tests/ir_goldens/golden__two_hop_expand.snap is excluded by !**/*.snap
  • crates/gf-ir/tests/ir_goldens/golden__unwind.snap is excluded by !**/*.snap
  • crates/gf-ir/tests/ir_goldens/golden__variable_length_expand.snap is excluded by !**/*.snap
  • crates/gf-ir/tests/ir_goldens/golden__with_pipeline.snap is excluded by !**/*.snap
📒 Files selected for processing (6)
  • crates/gf-api/tests/e2e_baseline.rs
  • crates/gf-ir/src/binder.rs
  • crates/gf-ir/src/lib.rs
  • crates/gf-ir/src/plan.rs
  • crates/gf-ir/tests/golden.rs
  • crates/gf-rel/src/lowerer.rs

…CodeRabbit)

Two review fixes:

- Recursive aggregate rejection: `expr_contains_aggregate` walks compound
  expressions (BinaryOp/UnaryOp/Parenthesized/FunctionCall args/Property/
  List), so `WITH count(n) + 1 AS c` is rejected at bind time, not only a
  bare top-level `WITH count(n) AS c`.
- Var-map scope reset: `lower_with_op` now clears the VarMap before
  installing the WITH aliases, so a dropped pre-WITH VarId can't resolve
  through a later lowering path — mirroring the binder's scope reset.
  Added `VarMap::clear`.

New e2e `with_nested_aggregate_is_rejected`; all WITH e2e + IR goldens
green; clippy --workspace -D warnings clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
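The recursive aggregate check from the first fix can be illustrated with a trimmed-down sketch. The `Expr` enum below is hypothetical and much smaller than the real AST, and the list of aggregate names is an assumption based on the standard Cypher aggregates. Only the shape of the walk follows the description: descend into every sub-expression rather than inspecting the top-level call alone.

```rust
/// Hypothetical, trimmed-down expression tree; the real AST has more variants.
enum Expr {
    Variable(String),
    Literal(i64),
    Property(Box<Expr>, String),
    BinaryOp(Box<Expr>, char, Box<Expr>),
    FunctionCall { name: String, args: Vec<Expr> },
    List(Vec<Expr>),
}

/// Assumed set of aggregate function names (case-insensitive).
fn is_aggregate_name(name: &str) -> bool {
    matches!(
        name.to_ascii_lowercase().as_str(),
        "count" | "sum" | "avg" | "min" | "max" | "collect"
    )
}

/// True if an aggregate call appears anywhere in the expression tree.
fn expr_contains_aggregate(expr: &Expr) -> bool {
    match expr {
        Expr::Variable(_) | Expr::Literal(_) => false,
        Expr::Property(base, _) => expr_contains_aggregate(base),
        Expr::BinaryOp(lhs, _, rhs) => {
            expr_contains_aggregate(lhs) || expr_contains_aggregate(rhs)
        }
        Expr::FunctionCall { name, args } => {
            is_aggregate_name(name) || args.iter().any(expr_contains_aggregate)
        }
        Expr::List(items) => items.iter().any(expr_contains_aggregate),
    }
}
```

With this shape, `count(n) + 1` is flagged because the walk reaches the `count` call through the `BinaryOp` node, while a plain property access such as `n.name` is not.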
@DecisionNerd DecisionNerd merged commit 62e4d35 into main Jun 23, 2026
41 checks passed
@DecisionNerd DecisionNerd deleted the feature/814-with-lowering branch June 23, 2026 20:12