Diagram is now an inspection-only tool. delete() and drop() have been
moved to Table. Updated diagram spec, whats-new-22, and delete-data
how-to to reflect this change.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
src/explanation/whats-new-22.md
29 additions & 11 deletions
@@ -1,6 +1,6 @@
1
1
# What's New in DataJoint 2.2
2
2
3
-
DataJoint 2.2 introduces **isolated instances**, **thread-safe mode**, and **graph-driven diagram operations** for applications that need multiple independent database connections, explicit cascade control, and operational use of the dependency graph.
3
+
DataJoint 2.2 introduces **isolated instances** and **thread-safe mode** for applications that need multiple independent database connections, and **graph-driven diagram operations** that replace the legacy error-driven cascade with a reliable, inspectable approach for all users.
4
4
5
5
> **Upgrading from 2.0 or 2.1?** No breaking changes. All existing code using `dj.config` and `dj.Schema()` continues to work. The new Instance API is purely additive.
6
6
@@ -207,27 +207,29 @@ DataJoint 2.2 promotes `dj.Diagram` from a visualization tool to an operational
207
207
208
208
### From Visualization to Operations
209
209
210
-
In prior versions, `dj.Diagram` existed solely for visualization — drawing the dependency graph as SVG or Mermaid output. The cascade logic inside `Table.delete()` traversed dependencies independently, with no way to inspect or control the cascade before it executed.
210
+
In prior versions, `dj.Diagram` existed solely for visualization — drawing the dependency graph as SVG or Mermaid output. The cascade logic inside `Table.delete()` traversed dependencies independently using an error-driven approach: attempt `DELETE` on the parent, catch the foreign key integrity error, parse the error message to discover which child table is blocking, then recursively delete from that child first. This had several problems:
211
211
212
-
In 2.2, `Table.delete()` and `Table.drop()` delegate internally to `dj.Diagram`. The user-facing behavior of `Table.delete()` is unchanged, but the diagram-level API is now available as a more powerful interface for complex scenarios.
212
+
- **MySQL 8 with limited privileges** returns error 1217 (`ROW_IS_REFERENCED`) instead of 1451 (`ROW_IS_REFERENCED_2`), which provides no table name, so the cascade crashes with no way to proceed.
213
+
- **PostgreSQL** aborts the entire transaction on any error, requiring `SAVEPOINT` / `ROLLBACK TO SAVEPOINT` round-trips for each failed delete attempt.
214
+
- **Fragile error parsing** across MySQL versions and privilege levels, where different configurations produce different error message formats.
215
+
216
+
In 2.2, `Table.delete()` and `Table.drop()` use `dj.Diagram` internally to compute the dependency graph and walk it in reverse topological order — deleting leaves first, with no trial-and-error needed. The user-facing behavior of `Table.delete()` is unchanged. The Diagram's `cascade()` and `preview()` methods are available as a public inspection API for understanding cascade impact before executing.
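The leaves-first ordering described above can be sketched with Python's standard `graphlib` module. This is a minimal illustration of reverse topological ordering, not DataJoint's implementation; the table names and the `delete_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency graph: each table maps to the parent tables
# it references through foreign keys.
parents = {
    "Mouse": set(),
    "Session": {"Mouse"},
    "Surgery": {"Mouse"},
    "Scan": {"Session"},
}


def delete_order(parents):
    """Return tables ordered so every child precedes its parents."""
    # static_order() yields each table after all of its parents,
    # so reversing it yields leaves first.
    forward = list(TopologicalSorter(parents).static_order())
    return forward[::-1]


order = delete_order(parents)
print(order)
```

Because every child is deleted before any table it references, no `DELETE` in this order should hit a foreign key error, assuming the graph covers every dependent table.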
213
217
214
218
### The Preview-Then-Execute Pattern
215
219
216
-
The key benefit of the diagram-level API is the ability to build a cascade explicitly, inspect it, and then decide whether to execute:
220
+
The key benefit of the diagram-level API is the ability to build a cascade explicitly, inspect it, and then execute via `Table.delete()`:
217
221
218
222
```python
219
-
# Build the dependency graph
223
+
# Build the dependency graph and inspect the cascade
220
224
diag = dj.Diagram(schema)
221
-
222
-
# Apply cascade restriction — nothing is deleted yet
This is valuable when working with unfamiliar pipelines, large datasets, or multi-schema dependencies where the cascade impact is not immediately obvious.
@@ -238,9 +240,11 @@ The diagram supports two restriction propagation modes designed for fundamentall
238
240
239
241
**`cascade()` prepares a delete.** It takes a single restricted table expression, propagates the restriction downstream through all descendants, and **trims the diagram** to the resulting subgraph — ancestors and unrelated tables are removed entirely. Convergence uses OR: a descendant row is marked for deletion if *any* ancestor path reaches it, because if any reason exists to remove a row, it should be removed. `cascade()` is one-shot and is always followed by `preview()` or `delete()`.
240
242
243
+
When the cascade encounters a part table whose master is not yet included in the cascade, the behavior depends on the `part_integrity` setting. With `"enforce"` (the default), `delete()` raises an error if part rows would be deleted without their master — preventing orphaned master rows. With `"cascade"`, the restriction propagates *upward* from the part to its master: the restricted part rows identify which master rows are affected, those masters receive a restriction, and that restriction then propagates back downstream to all sibling parts — deleting the entire compositional unit, not just the originally matched part rows.
244
+
241
245
**`restrict()` selects a data subset.** It propagates a restriction downstream but **preserves the full diagram**, allowing `restrict()` to be called again from a different seed table. This makes it possible to build up multi-condition subsets incrementally — for example, restricting by species from one table and by date from another. Convergence uses AND: a descendant row is included only if *all* restricted ancestors match, because an export should contain only rows satisfying every condition. After chaining restrictions, use `prune()` to remove empty tables and `preview()` to inspect the result.
242
246
243
-
The two modes are mutually exclusive on the same diagram. This prevents accidental mixing of incompatible semantics — a delete diagram should never be reused for subsetting, and vice versa.
247
+
The two modes are mutually exclusive on the same diagram — DataJoint raises an error if you attempt to mix `cascade()` and `restrict()`, or if you call `cascade()` more than once. This prevents accidental mixing of incompatible semantics: a delete diagram should never be reused for subsetting, and vice versa.
244
248
245
249
### Pruning Empty Tables
246
250
@@ -260,7 +264,21 @@ Without prior restrictions, `prune()` removes physically empty tables. This is u
260
264
261
265
### Architecture
262
266
263
-
`Table.delete()` now constructs a `Diagram` internally, calls `cascade()`, and then `delete()`. This means every table-level delete benefits from the same graph-driven logic. The diagram-level API simply exposes this machinery for direct use when more control is needed.
267
+
`Table.delete()` constructs a `Diagram` internally, calls `cascade()` to compute the affected subgraph, then executes the delete itself in reverse topological order. The Diagram is purely a graph computation and inspection tool — it computes the cascade and provides `preview()`, but all mutation logic (transactions, SQL execution, prompts) lives in `Table.delete()` and `Table.drop()`.
268
+
269
+
### Advantages over Error-Driven Cascade
270
+
271
+
The graph-driven approach resolves every known limitation of the prior error-driven cascade:
Cascade inspection via `dj.Diagram` was added in DataJoint 2.2.
196
196
197
-
For complex scenarios — previewing the blast radius, working across schemas, or understanding the dependency graph before deleting — use `dj.Diagram` to build and inspect the cascade before executing.
197
+
For a quick preview, `table.delete(dry_run=True)` returns the affected row counts without deleting anything:
For more complex scenarios — working across schemas, chaining multiple restrictions, or visualizing the dependency graph — use `dj.Diagram` to build and inspect the cascade explicitly:
200
206
201
207
```python
202
208
import datajoint as dj
203
209
204
-
# 1. Build the dependency graph
210
+
# 1. Build the dependency graph and apply cascade restriction
- **Preview blast radius**: Understand what a cascade delete will affect before committing
221
-
- **Multi-schema cascades**: Build a diagram spanning multiple schemas and delete across them in one operation
228
+
- **Multi-schema inspection**: Build a diagram spanning multiple schemas to visualize cascade impact
222
229
- **Programmatic control**: Use `preview()` return values to make decisions in automated workflows
223
230
224
-
For simple single-table deletes, `(Table & restriction).delete()` remains the simplest approach. The diagram-level API is for when you need more visibility or control.
231
+
For simple single-table deletes, `(Table & restriction).delete()` remains the simplest approach. The diagram API is for when you need more visibility before executing.
Operational methods (`cascade`, `restrict`, `delete`, `drop`, `preview`, `prune`) were added in DataJoint 2.2.
123
+
Operational methods (`cascade`, `restrict`, `preview`, `prune`) were added in DataJoint 2.2.
124
124
125
-
Diagrams can propagate restrictions through the dependency graph and execute data operations (delete, drop) using the graph structure. These methods turn Diagram from a visualization tool into an operational component.
125
+
Diagrams can propagate restrictions through the dependency graph and inspect affected data using the graph structure. These methods turn Diagram from a visualization tool into a graph computation and inspection component. All mutation operations (delete, drop) are executed by `Table.delete()` and `Table.drop()`, which use Diagram internally.
Execute a cascading delete on the cascade subgraph. All tables in the diagram are deleted in reverse topological order (leaves first) to maintain referential integrity.
199
-
200
-
| Parameter | Type | Default | Description |
201
-
|-----------|------|---------|-------------|
202
-
| `transaction` | bool | `True` | Wrap in atomic transaction |
203
-
| `prompt` | bool or None | `None` | Prompt for confirmation. Default: `dj.config['safemode']` |
204
-
| `dry_run` | bool | `False` | If `True`, return affected row counts without deleting |
205
-
206
-
**Returns:** `int` (rows deleted from root table) or `dict[str, int]` (table → row count mapping when `dry_run=True`).
**Note:** Unlike `delete()`, `drop()` does not use cascade restrictions. It drops all tables in the diagram.
234
-
235
192
### `preview()`
236
193
237
194
```python
@@ -257,7 +214,7 @@ counts = restricted.preview()
257
214
diag.prune()
258
215
```
259
216
260
-
Remove tables with zero matching rows from the diagram. Without prior restrictions, removes physically empty tables. With restrictions (`cascade()` or `restrict()`), removes tables where the restricted query yields zero rows.
217
+
Remove tables with zero matching rows from the diagram view. This only affects the diagram object — no tables or data are modified in the database. Without prior restrictions, removes physically empty tables from the diagram. With restrictions (`cascade()` or `restrict()`), removes tables where the restricted query yields zero rows.
261
218
262
219
**Returns:** New `Diagram` with empty tables removed.
263
220
@@ -291,6 +248,14 @@ When a child table has multiple restricted ancestors, the convergence rule depen
291
248
- **`cascade()` (OR):** A child row is affected if *any* path from a restricted ancestor reaches it. This is appropriate for delete: if any reason exists to delete a row, it should be deleted.
292
249
- **`restrict()` (AND):** A child row is included only if *all* restricted ancestors match. This is appropriate for export: only rows satisfying every condition are selected.
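The two convergence rules can be illustrated with plain Python sets. This sketch only models the OR/AND semantics described above and does not call DataJoint; the rows, column names, and conditions are hypothetical.

```python
# Hypothetical child rows, each reachable from two restricted ancestors:
# one restricted by species, the other by date.
rows = [
    {"id": 1, "species": "mouse", "date": "2024-01-01"},
    {"id": 2, "species": "mouse", "date": "2024-03-01"},
    {"id": 3, "species": "rat", "date": "2024-03-01"},
    {"id": 4, "species": "rat", "date": "2024-01-01"},
]

# Rows reached through each restricted ancestor path.
by_species = {r["id"] for r in rows if r["species"] == "mouse"}
by_date = {r["id"] for r in rows if r["date"] >= "2024-02-01"}

# cascade(): OR convergence, any matching path marks the row.
cascade_rows = by_species | by_date

# restrict(): AND convergence, every restricted ancestor must match.
restrict_rows = by_species & by_date

print(sorted(cascade_rows))   # [1, 2, 3]
print(sorted(restrict_rows))  # [2]
```

Row 4 matches neither condition and is untouched in both modes, while rows 1 and 3 are affected only under OR.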
293
250
251
+
**Multiple foreign keys to the same parent:**
252
+
253
+
When a child table references the same parent through multiple foreign keys (e.g., `source_mouse` and `target_mouse` both referencing `Mouse`), these paths always combine with **OR** regardless of the propagation mode. Each foreign key path is an independent reason for the child row to be affected — this is structural, not operation-dependent.
254
+
255
+
**Unloaded schemas:**
256
+
257
+
If a descendant table lives in a schema that hasn't been activated (loaded into the dependency graph), the graph-driven delete won't know about it. The final `DELETE` on the parent will fail with a foreign key error. DataJoint catches this and produces an actionable error message identifying which schema needs to be activated.