docs: add restriction semantics for non-downstream nodes and OR/AND
- Unrestricted nodes are not affected by operations
- Multiple restrict() calls create separate restriction sets
- Delete combines sets with OR (any taint → delete)
- Export combines sets with AND (all criteria → include)
- Within a set, multiple FK paths combine with OR (structural)
- Added open questions on lenient vs strict AND and same-table restrictions
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
A restricted diagram distinguishes between three kinds of nodes:

1. **Directly restricted**: the user applied a restriction to this node
2. **Indirectly restricted**: a restriction propagated to this node from an ancestor
3. **Unrestricted**: no restriction reached this node

**Only restricted nodes (direct or indirect) participate in operations.** Unrestricted nodes are left untouched. This is critical for delete: if you restrict `Session & 'subject=1'`, only `Session` and its downstream dependents are affected. Tables in the diagram that are not downstream of `Session` (e.g., `Equipment`, `Lab`) are not deleted.

The restricted diagram's `topo_sort()` for operations should only yield nodes that carry a restriction. Unrestricted nodes are effectively invisible to the operation.
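One way to picture the filtered traversal is the minimal sketch below. It assumes the diagram can be reduced to a plain mapping of table name to parent names, and it uses the standard-library `graphlib.TopologicalSorter` instead of the diagram's own `topo_sort()`. The function name `restricted_topo_order` and the `restrictions` mapping are hypothetical, introduced only for illustration.

```python
from graphlib import TopologicalSorter


def restricted_topo_order(parents, restrictions):
    """Return table names parents-first, skipping unrestricted nodes.

    parents:      dict mapping table name -> set of parent table names
    restrictions: dict mapping table name -> list of restrictions
                  (hypothetical shape, for illustration only)
    """
    order = TopologicalSorter(parents).static_order()
    return [name for name in order if restrictions.get(name)]


# Toy diagram: Recording depends on Session; Equipment and Lab are unrelated.
parents = {
    "Session": set(),
    "Recording": {"Session"},
    "Equipment": set(),
    "Lab": set(),
}
# Restricting Session propagates to Recording only.
restrictions = {"Session": ["subject=1"], "Recording": ["via Session"]}

order = restricted_topo_order(parents, restrictions)
print(order)
```

`Equipment` and `Lab` never appear in `order`, so an operation that iterates over it cannot touch them.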
### Multiple restrictions: OR vs AND

When multiple restrictions are applied to different tables in the diagram, downstream nodes may receive restrictions from multiple parents. How these combine depends on the operation.

**Example:** A diagram with `Lab`, `Session`, and `Recording`, where `Recording` depends on both `Session` and `Lab`.

```python
rd = dj.Diagram(schema)
rd.restrict(Session & 'subject=1')   # R1 propagates to Recording
rd.restrict(Lab & 'lab="brody"')     # R2 propagates to Recording
```

`Recording` now has two propagated restrictions:

- R1: rows referencing subject=1 sessions
- R2: rows referencing brody lab
**For delete (OR / union):** A recording should be deleted if it is tainted by *any* restricted parent. Deleting subject 1 means all their recordings go, regardless of which lab. Deleting brody lab means all their recordings go, regardless of subject. The two restrictions combine with OR.

**For export/publish (AND / intersection):** A recording should be exported only if it satisfies *all* criteria. You want specifically brody lab's subject 1 recordings. The two restrictions combine with AND.
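The difference between the two modes can be seen on a toy table. The sketch below uses plain Python sets of row ids instead of DataJoint queries; the four `Recording` rows are made-up illustration data.

```python
# Toy Recording rows: (recording_id, subject, lab). Illustration data only.
recordings = [
    (1, 1, "brody"),
    (2, 1, "other"),
    (3, 2, "brody"),
    (4, 2, "other"),
]

# R1: rows referencing subject=1 sessions; R2: rows referencing brody lab.
r1 = {rid for rid, subject, lab in recordings if subject == 1}
r2 = {rid for rid, subject, lab in recordings if lab == "brody"}

to_delete = r1 | r2  # OR / union: tainted by any restricted parent
to_export = r1 & r2  # AND / intersection: satisfies all criteria

print(sorted(to_delete))  # recordings 1, 2, 3
print(sorted(to_export))  # recording 1 only
```

Recording 4 belongs to neither subject 1 nor brody lab, so it is untouched by both operations.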
**Implementation:** The diagram stores restrictions as separate **restriction sets**, one per `restrict()` call. Each set propagates independently. The combination logic is deferred to the operation:

```python
class RestrictedDiagram:
    # Each restrict() call creates a new restriction set.
    # A restriction set is a dict mapping table_name → list[restriction]
    # (list = OR within a set, for multiple FK paths from different parents)
    _restriction_sets: list[dict[str, list]]

    def restrict(self, table_expr):
        """Add a new restriction set. Propagate downstream."""
        ...  # remainder of this block is not shown in this excerpt
```

Within a single restriction set, multiple restrictions at the same node (from different FK paths) are always OR: a row that references a restricted parent through *any* FK is affected. This is structural and operation-independent.

*Across* restriction sets (separate `restrict()` calls on different tables), the combination depends on the operation. The diagram stores them separately and lets the operation choose.
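The split between "OR within a set" and "operation-dependent across sets" can be sketched with a toy graph. Everything here is hypothetical: `ToyRestrictedDiagram`, the `children` mapping, and the `"via ..."` strings stand in for real tables and real propagated restrictions, and this is not the DataJoint implementation.

```python
from collections import defaultdict, deque


class ToyRestrictedDiagram:
    """Hypothetical stand-in for the design above; not DataJoint code."""

    def __init__(self, children):
        # children: dict mapping table name -> list of child table names
        self.children = children
        # One dict per restrict() call: table name -> list of restrictions
        self.restriction_sets = []

    def restrict(self, table, condition):
        """Add a new restriction set and propagate it downstream."""
        rset = defaultdict(list)
        rset[table].append(condition)
        queue = deque([table])
        while queue:
            parent = queue.popleft()
            for child in self.children.get(parent, []):
                first_visit = child not in rset
                # Each FK path into the child adds one entry (OR within the set).
                rset[child].append(f"via {parent}")
                if first_visit:
                    queue.append(child)
        self.restriction_sets.append(dict(rset))


# Recording is reachable from Session directly and through Trial.
diagram = ToyRestrictedDiagram(
    {"Session": ["Recording", "Trial"], "Trial": ["Recording"], "Lab": ["Recording"]}
)
diagram.restrict("Session", "subject=1")
diagram.restrict("Lab", 'lab="brody"')

for rset in diagram.restriction_sets:
    print(rset)
```

The first set holds two entries for `Recording` (one per FK path), while the `Lab` restriction lives in its own set. Nothing is merged across sets at this stage.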
**Edge case (node restricted in some sets but not others):**

For AND mode (export): if a node is downstream of restriction set R1 but not R2, what happens? The node has restrictions from R1 but none from R2. Two options:

- **Strict AND**: node is excluded (no data exported) because it doesn't satisfy all criteria
- **Lenient AND**: only apply AND across sets that actually reach this node

Lenient AND is more useful: combining `restrict(Session & 'subject=1')` with `restrict(Stimulus & 'type="visual"')` should export recordings that are from subject 1 AND use visual stimuli, but should also export the `Session` rows for subject 1 even though `Stimulus` restrictions don't propagate up to `Session`. The lenient interpretation applies AND only where multiple restriction sets converge.

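The two AND interpretations differ only in how they treat a restriction set that never reaches a node. A minimal sketch, assuming each restriction set has already been resolved to a set of row ids for the node (or `None` when the set does not reach it); the function `combine_and` and the row ids are hypothetical:

```python
def combine_and(per_set_rows, strict):
    """Intersect row sets across restriction sets for one node.

    per_set_rows: one entry per restriction set; a set of row ids if that
                  restriction set reaches the node, or None if it does not.
    strict:       True for strict AND, False for lenient AND.
    """
    reaching = [rows for rows in per_set_rows if rows is not None]
    if strict and len(reaching) < len(per_set_rows):
        return set()  # some set never reached this node: exclude it
    if not reaching:
        return set()  # unrestricted node: not part of the operation
    return set.intersection(*reaching)


# Session is reached only by the Session restriction, not by Stimulus.
session_rows = [{10, 11}, None]
# Recording is reached by both restriction sets.
recording_rows = [{1, 2}, {1, 3}]

print(combine_and(session_rows, strict=False))    # lenient keeps Session rows
print(combine_and(session_rows, strict=True))     # strict drops them
print(combine_and(recording_rows, strict=False))  # same result in both modes
```

Where both sets converge (`Recording`), the two modes agree. They diverge only on nodes such as `Session` that a set never reaches.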
### Restriction propagation

-Each node in the `RestrictedDiagram` carries a list of restrictions (combined with OR for multiple FK paths from different parents).
+Each restriction set propagates independently through the graph. Within a set, each node carries a list of restrictions (OR-combined for multiple FK paths).

**Propagation rules for edge `Parent → Child` with `attr_map`:**
@@ -70,7 +173,7 @@ Each node in the `RestrictedDiagram` carries a list of restrictions (combined wi
   Restrict child by `parent.proj(**{fk: pk for fk, pk in attr_map.items()})`.

3. **Multiple FK paths to the same child** (via alias nodes):
-   Each path produces a separate restriction. These combine with OR: a child row must be deleted if it references restricted parent rows through *any* FK.
+   Each path produces a separate restriction within the same set. These combine with OR: a child row is affected if it references restricted parent rows through *any* FK.

This reuses the existing restriction logic from the current `cascade()` function (lines 1082–1090 of `table.py`), but applies it upfront during graph traversal rather than reactively from error messages.
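The renaming step and the OR across FK paths can be illustrated without a database. The sketch below models rows as plain dicts and assumes `attr_map` maps the child's FK attribute to the parent's PK attribute, matching the direction in the `proj` expression above. The helpers `project` and `restrict_child` and the `pairings` table are hypothetical and do not reproduce DataJoint's `proj` or restriction operators.

```python
def project(parent_rows, attr_map):
    """Rename parent primary-key attributes to the child's FK names.

    attr_map: dict mapping child FK attribute -> parent PK attribute.
    """
    return [{fk: row[pk] for fk, pk in attr_map.items()} for row in parent_rows]


def restrict_child(child_rows, restrictions):
    """Keep child rows that match ANY restriction (OR across FK paths)."""

    def matches_any(row, keys):
        return any(all(row[attr] == key[attr] for attr in key) for key in keys)

    return [
        row for row in child_rows
        if any(matches_any(row, keys) for keys in restrictions)
    ]


# Restricted parent rows (Subject & 'subject_id=1').
restricted_subjects = [{"subject_id": 1}]

# Toy child table with two aliased FKs to Subject.
pairings = [
    {"pair_id": 1, "subject_a": 1, "subject_b": 2},
    {"pair_id": 2, "subject_a": 3, "subject_b": 1},
    {"pair_id": 3, "subject_a": 2, "subject_b": 3},
]

# One restriction per FK path, kept as separate entries in the same set.
restrictions = [
    project(restricted_subjects, {"subject_a": "subject_id"}),
    project(restricted_subjects, {"subject_b": "subject_id"}),
]

affected = restrict_child(pairings, restrictions)
print([row["pair_id"] for row in affected])
```

Pairings 1 and 2 are affected because each references subject 1 through one of the two FKs; pairing 3 references it through neither.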

Eager: propagate all restrictions when `restrict()` is called (computes row counts immediately).
Lazy: store parent restrictions and propagate during `delete()`/`export()` (defers queries).
Eager is better for preview but may issue many queries upfront. Lazy is more efficient when the user just wants to delete without preview. Consider lazy propagation with an eager option for preview.

5. **Lenient vs strict AND for export:**
   When using AND mode across restriction sets, a node may be downstream of some restriction sets but not others. Lenient AND (apply intersection only where sets converge) is more practical but harder to reason about. Strict AND (node must be restricted by all sets) is simpler but may be too aggressive. Need to validate with real export use cases.

6. **Restricting the same table in multiple `restrict()` calls:**
   If the user calls `rd.restrict(Session & 'subject=1')` then `rd.restrict(Session & 'subject=2')`, these become two restriction sets. For delete (OR): deletes subject 1 and subject 2. For export (AND): exports rows that are somehow both subject 1 and 2 (empty set). Should restricting the same table in multiple calls be treated specially, perhaps accumulating within a single set instead?
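The eager/lazy trade-off raised in the open questions can be made concrete with a counter standing in for database round trips. This is a rough sketch under stated assumptions: `PropagationSketch`, its `eager` flag, and the one-query-per-table cost model are all hypothetical and say nothing about how the real implementation would count queries.

```python
class PropagationSketch:
    """Hypothetical toy contrasting eager and lazy propagation."""

    def __init__(self, children, eager=False):
        self.children = children  # table name -> list of child table names
        self.eager = eager
        self.pending = []         # restrictions not yet propagated
        self.reached = set()      # tables reached by propagation
        self.queries = 0          # stand-in for database round trips

    def restrict(self, table):
        self.pending.append(table)
        if self.eager:
            self._propagate()

    def preview(self):
        """Force propagation and list the affected tables."""
        self._propagate()
        return sorted(self.reached)

    def _propagate(self):
        while self.pending:
            stack = [self.pending.pop()]
            while stack:
                node = stack.pop()
                if node in self.reached:
                    continue
                self.reached.add(node)
                self.queries += 1  # assume one query per newly reached table
                stack.extend(self.children.get(node, []))


children = {"Session": ["Recording"], "Lab": ["Recording"]}

lazy = PropagationSketch(children)
lazy.restrict("Session")
queries_before_preview = lazy.queries  # nothing has run yet
lazy_preview = lazy.preview()          # propagation happens here

eager = PropagationSketch(children, eager=True)
eager.restrict("Session")              # propagation happens immediately
```

In the lazy variant the cost is paid only when a preview or operation asks for it. The eager variant pays it inside `restrict()`, which is what makes an immediate preview possible.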