Commit 187408b

docs: remove delete/drop from Diagram public API
Diagram is now an inspection-only tool. delete() and drop() have been moved to Table. Updated diagram spec, whats-new-22, and delete-data how-to to reflect this change.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 6c9291d commit 187408b

3 files changed

Lines changed: 61 additions & 71 deletions

src/explanation/whats-new-22.md

Lines changed: 29 additions & 11 deletions
@@ -1,6 +1,6 @@
 # What's New in DataJoint 2.2

-DataJoint 2.2 introduces **isolated instances**, **thread-safe mode**, and **graph-driven diagram operations** for applications that need multiple independent database connections, explicit cascade control, and operational use of the dependency graph.
+DataJoint 2.2 introduces **isolated instances** and **thread-safe mode** for applications that need multiple independent database connections, and **graph-driven diagram operations** that replace the legacy error-driven cascade with a reliable, inspectable approach for all users.

 > **Upgrading from 2.0 or 2.1?** No breaking changes. All existing code using `dj.config` and `dj.Schema()` continues to work. The new Instance API is purely additive.

@@ -207,27 +207,29 @@ DataJoint 2.2 promotes `dj.Diagram` from a visualization tool to an operational

 ### From Visualization to Operations

-In prior versions, `dj.Diagram` existed solely for visualization — drawing the dependency graph as SVG or Mermaid output. The cascade logic inside `Table.delete()` traversed dependencies independently, with no way to inspect or control the cascade before it executed.
+In prior versions, `dj.Diagram` existed solely for visualization — drawing the dependency graph as SVG or Mermaid output. The cascade logic inside `Table.delete()` traversed dependencies independently using an error-driven approach: attempt `DELETE` on the parent, catch the foreign key integrity error, parse the error message to discover which child table is blocking, then recursively delete from that child first. This had several problems:

-In 2.2, `Table.delete()` and `Table.drop()` delegate internally to `dj.Diagram`. The user-facing behavior of `Table.delete()` is unchanged, but the diagram-level API is now available as a more powerful interface for complex scenarios.
+- **MySQL 8 with limited privileges** returns error 1217 (`ROW_IS_REFERENCED`) instead of 1451 (`ROW_IS_REFERENCED_2`), which provides no table name — the cascade crashes with no way to proceed.
+- **PostgreSQL** aborts the entire transaction on any error, requiring `SAVEPOINT` / `ROLLBACK TO SAVEPOINT` round-trips for each failed delete attempt.
+- **Fragile error parsing** across MySQL versions and privilege levels, where different configurations produce different error message formats.
+
+In 2.2, `Table.delete()` and `Table.drop()` use `dj.Diagram` internally to compute the dependency graph and walk it in reverse topological order — deleting leaves first, with no trial-and-error needed. The user-facing behavior of `Table.delete()` is unchanged. The Diagram's `cascade()` and `preview()` methods are available as a public inspection API for understanding cascade impact before executing.

 ### The Preview-Then-Execute Pattern

-The key benefit of the diagram-level API is the ability to build a cascade explicitly, inspect it, and then decide whether to execute:
+The key benefit of the diagram-level API is the ability to build a cascade explicitly, inspect it, and then execute via `Table.delete()`:

 ```python
-# Build the dependency graph
+# Build the dependency graph and inspect the cascade
 diag = dj.Diagram(schema)
-
-# Apply cascade restriction — nothing is deleted yet
 restricted = diag.cascade(Session & {'subject_id': 'M001'})

 # Inspect: what tables and how many rows would be affected?
 counts = restricted.preview()
 # {'`lab`.`session`': 3, '`lab`.`trial`': 45, '`lab`.`processed_data`': 45}

-# Execute only after reviewing the blast radius
-restricted.delete(prompt=False)
+# Execute via Table.delete() after reviewing the blast radius
+(Session & {'subject_id': 'M001'}).delete(prompt=False)
 ```

 This is valuable when working with unfamiliar pipelines, large datasets, or multi-schema dependencies where the cascade impact is not immediately obvious.
@@ -238,9 +240,11 @@ The diagram supports two restriction propagation modes designed for fundamentall

 **`cascade()` prepares a delete.** It takes a single restricted table expression, propagates the restriction downstream through all descendants, and **trims the diagram** to the resulting subgraph — ancestors and unrelated tables are removed entirely. Convergence uses OR: a descendant row is marked for deletion if *any* ancestor path reaches it, because if any reason exists to remove a row, it should be removed. `cascade()` is one-shot and is always followed by `preview()` or `delete()`.

+When the cascade encounters a part table whose master is not yet included in the cascade, the behavior depends on the `part_integrity` setting. With `"enforce"` (the default), `delete()` raises an error if part rows would be deleted without their master — preventing orphaned master rows. With `"cascade"`, the restriction propagates *upward* from the part to its master: the restricted part rows identify which master rows are affected, those masters receive a restriction, and that restriction then propagates back downstream to all sibling parts — deleting the entire compositional unit, not just the originally matched part rows.
+
 **`restrict()` selects a data subset.** It propagates a restriction downstream but **preserves the full diagram**, allowing `restrict()` to be called again from a different seed table. This makes it possible to build up multi-condition subsets incrementally — for example, restricting by species from one table and by date from another. Convergence uses AND: a descendant row is included only if *all* restricted ancestors match, because an export should contain only rows satisfying every condition. After chaining restrictions, use `prune()` to remove empty tables and `preview()` to inspect the result.

-The two modes are mutually exclusive on the same diagram. This prevents accidental mixing of incompatible semantics — a delete diagram should never be reused for subsetting, and vice versa.
+The two modes are mutually exclusive on the same diagram — DataJoint raises an error if you attempt to mix `cascade()` and `restrict()`, or if you call `cascade()` more than once. This prevents accidental mixing of incompatible semantics: a delete diagram should never be reused for subsetting, and vice versa.

 ### Pruning Empty Tables

@@ -260,7 +264,21 @@ Without prior restrictions, `prune()` removes physically empty tables. This is u

 ### Architecture

-`Table.delete()` now constructs a `Diagram` internally, calls `cascade()`, and then `delete()`. This means every table-level delete benefits from the same graph-driven logic. The diagram-level API simply exposes this machinery for direct use when more control is needed.
+`Table.delete()` constructs a `Diagram` internally, calls `cascade()` to compute the affected subgraph, then executes the delete itself in reverse topological order. The Diagram is purely a graph computation and inspection tool — it computes the cascade and provides `preview()`, but all mutation logic (transactions, SQL execution, prompts) lives in `Table.delete()` and `Table.drop()`.
+
+### Advantages over Error-Driven Cascade
+
+The graph-driven approach resolves every known limitation of the prior error-driven cascade:
+
+| Scenario | Error-driven (prior) | Graph-driven (2.2) |
+|---|---|---|
+| MySQL 8 + limited privileges | Crashes (error 1217, no table name) | Works — no error parsing needed |
+| PostgreSQL | Savepoint overhead per attempt | No errors triggered |
+| Multiple FKs to same child | One-at-a-time via retry loop | All paths resolved upfront |
+| Part integrity enforcement | Post-hoc check after delete | Data-driven post-check (no false positives) |
+| Unloaded schemas | Crash with opaque error | Clear error: "activate schema X" |
+| Reusability | Delete-only | Delete, drop, export, prune |
+| Inspectability | Opaque recursive cascade | `preview()` / `dry_run` before executing |

 ## See Also

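The Architecture text in this file says the delete runs in reverse topological order, so that leaf tables are emptied before the tables they reference. A minimal sketch of that ordering follows. It assumes a hypothetical three-table dependency map whose names echo the preview example, and it uses the standard library's `graphlib` rather than DataJoint's own graph code (the spec says DataJoint relies on `networkx`). `delete_order` is an illustrative helper, not a DataJoint API.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each table maps to the set of parents it references.
dependencies = {
    "session": set(),
    "trial": {"session"},
    "processed_data": {"trial"},
}


def delete_order(deps):
    """Return table names leaves-first (reverse topological order)."""
    # static_order() yields parents before children; reversing puts leaves first.
    parents_first = list(TopologicalSorter(deps).static_order())
    return list(reversed(parents_first))


order = delete_order(dependencies)
print(order)  # ['processed_data', 'trial', 'session']
```

Because the order is computed up front from the graph, no `DELETE` has to fail in order to discover which child table is blocking.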
src/how-to/delete-data.md

Lines changed: 19 additions & 12 deletions
@@ -189,39 +189,46 @@ count = (Subject & restriction).delete(prompt=False)
 print(f"Deleted {count} subjects")
 ```

-## Diagram-Level Delete
+## Inspecting Cascade Before Deleting

 !!! version-added "New in 2.2"
-    Diagram-level delete was added in DataJoint 2.2.
+    Cascade inspection via `dj.Diagram` was added in DataJoint 2.2.

-For complex scenarios — previewing the blast radius, working across schemas, or understanding the dependency graph before deleting — use `dj.Diagram` to build and inspect the cascade before executing.
+For a quick preview, `table.delete(dry_run=True)` returns the affected row counts without deleting anything:

-### Build, Preview, Execute
+```python
+# Quick preview of what would be deleted
+(Session & {'subject_id': 'M001'}).delete(dry_run=True)
+# {'`lab`.`session`': 3, '`lab`.`trial`': 45, '`lab`.`processed_data`': 45}
+```
+
+For more complex scenarios — working across schemas, chaining multiple restrictions, or visualizing the dependency graph — use `dj.Diagram` to build and inspect the cascade explicitly:

 ```python
 import datajoint as dj

-# 1. Build the dependency graph
+# 1. Build the dependency graph and apply cascade restriction
 diag = dj.Diagram(schema)
-
-# 2. Apply cascade restriction (nothing deleted yet)
 restricted = diag.cascade(Session & {'subject_id': 'M001'})

-# 3. Preview: see affected tables and row counts
+# 2. Preview: see affected tables and row counts
 counts = restricted.preview()
 # {'`lab`.`session`': 3, '`lab`.`trial`': 45, '`lab`.`processed_data`': 45}

-# 4. Execute only after reviewing
-restricted.delete(prompt=False)
+# 3. Visualize the cascade subgraph (in Jupyter)
+restricted
+
+# 4. Execute via Table.delete() after reviewing
+(Session & {'subject_id': 'M001'}).delete(prompt=False)
 ```

 ### When to Use

 - **Preview blast radius**: Understand what a cascade delete will affect before committing
-- **Multi-schema cascades**: Build a diagram spanning multiple schemas and delete across them in one operation
+- **Multi-schema inspection**: Build a diagram spanning multiple schemas to visualize cascade impact
 - **Programmatic control**: Use `preview()` return values to make decisions in automated workflows

-For simple single-table deletes, `(Table & restriction).delete()` remains the simplest approach. The diagram-level API is for when you need more visibility or control.
+For simple single-table deletes, `(Table & restriction).delete()` remains the simplest approach. The diagram API is for when you need more visibility before executing.

 ## See Also

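The "Programmatic control" bullet in this file suggests using the `preview()` return value to make decisions in automated workflows. One way such a guard might look is sketched below. It assumes the counts arrive as a plain `dict` of table name to row count, as in the example output, and it never touches a database. `within_blast_radius` and the row limits are hypothetical, not part of DataJoint.

```python
def within_blast_radius(counts, max_rows):
    """Return True if the total number of affected rows is at most max_rows."""
    total = sum(counts.values())
    return total <= max_rows


# Counts copied from the documented preview() example output.
counts = {
    "`lab`.`session`": 3,
    "`lab`.`trial`": 45,
    "`lab`.`processed_data`": 45,
}

if within_blast_radius(counts, max_rows=100):
    print("cascade is small enough to proceed")
else:
    print("cascade too large; ask a human to review")
```

In a real workflow the `True` branch would be where `(Session & restriction).delete(prompt=False)` is called, after `preview()` has supplied the counts.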
src/reference/specs/diagram.md

Lines changed: 13 additions & 48 deletions
@@ -120,9 +120,9 @@ dj.Diagram(Subject) + dj.Diagram(analysis).collapse()
 ## Operational Methods

 !!! version-added "New in 2.2"
-    Operational methods (`cascade`, `restrict`, `delete`, `drop`, `preview`, `prune`) were added in DataJoint 2.2.
+    Operational methods (`cascade`, `restrict`, `preview`, `prune`) were added in DataJoint 2.2.

-Diagrams can propagate restrictions through the dependency graph and execute data operations (delete, drop) using the graph structure. These methods turn Diagram from a visualization tool into an operational component.
+Diagrams can propagate restrictions through the dependency graph and inspect affected data using the graph structure. These methods turn Diagram from a visualization tool into a graph computation and inspection component. All mutation operations (delete, drop) are executed by `Table.delete()` and `Table.drop()`, which use Diagram internally.

 ### `cascade()`

@@ -189,49 +189,6 @@ restricted = (diag
     .restrict(Session & 'session_date > "2024-01-01"'))
 ```

-### `delete()`
-
-```python
-diag.delete(transaction=True, prompt=None, dry_run=False)
-```
-
-Execute a cascading delete on the cascade subgraph. All tables in the diagram are deleted in reverse topological order (leaves first) to maintain referential integrity.
-
-| Parameter | Type | Default | Description |
-|-----------|------|---------|-------------|
-| `transaction` | bool | `True` | Wrap in atomic transaction |
-| `prompt` | bool or None | `None` | Prompt for confirmation. Default: `dj.config['safemode']` |
-| `dry_run` | bool | `False` | If `True`, return affected row counts without deleting |
-
-**Returns:** `int` (rows deleted from root table) or `dict[str, int]` (table → row count mapping when `dry_run=True`).
-
-**Requires:** `cascade()` must be called first.
-
-```python
-diag = dj.Diagram(schema)
-restricted = diag.cascade(Session & {'subject_id': 'M001'})
-restricted.preview() # inspect what will be deleted
-restricted.delete() # execute the delete
-```
-
-### `drop()`
-
-```python
-diag.drop(prompt=None, part_integrity="enforce", dry_run=False)
-```
-
-Drop all tables in the diagram in reverse topological order.
-
-| Parameter | Type | Default | Description |
-|-----------|------|---------|-------------|
-| `prompt` | bool or None | `None` | Prompt for confirmation. Default: `dj.config['safemode']` |
-| `part_integrity` | str | `"enforce"` | `"enforce"` or `"ignore"` |
-| `dry_run` | bool | `False` | If `True`, return row counts without dropping tables |
-
-**Returns:** `dict[str, int]` (table → row count mapping when `dry_run=True`). Returns `None` otherwise.
-
-**Note:** Unlike `delete()`, `drop()` does not use cascade restrictions. It drops all tables in the diagram.
-
 ### `preview()`

 ```python
@@ -257,7 +214,7 @@ counts = restricted.preview()
 diag.prune()
 ```

-Remove tables with zero matching rows from the diagram. Without prior restrictions, removes physically empty tables. With restrictions (`cascade()` or `restrict()`), removes tables where the restricted query yields zero rows.
+Remove tables with zero matching rows from the diagram view. This only affects the diagram object — no tables or data are modified in the database. Without prior restrictions, removes physically empty tables from the diagram. With restrictions (`cascade()` or `restrict()`), removes tables where the restricted query yields zero rows.

 **Returns:** New `Diagram` with empty tables removed.

@@ -291,6 +248,14 @@ When a child table has multiple restricted ancestors, the convergence rule depen
 - **`cascade()` (OR):** A child row is affected if *any* path from a restricted ancestor reaches it. This is appropriate for delete — if any reason exists to delete a row, it should be deleted.
 - **`restrict()` (AND):** A child row is included only if *all* restricted ancestors match. This is appropriate for export — only rows satisfying every condition are selected.

+**Multiple foreign keys to the same parent:**
+
+When a child table references the same parent through multiple foreign keys (e.g., `source_mouse` and `target_mouse` both referencing `Mouse`), these paths always combine with **OR** regardless of the propagation mode. Each foreign key path is an independent reason for the child row to be affected — this is structural, not operation-dependent.
+
+**Unloaded schemas:**
+
+If a descendant table lives in a schema that hasn't been activated (loaded into the dependency graph), the graph-driven delete won't know about it. The final `DELETE` on the parent will fail with a foreign key error. DataJoint catches this and produces an actionable error message identifying which schema needs to be activated.
+
 ---

 ## Output Methods
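The convergence rules in this hunk (OR for `cascade()`, AND for `restrict()`) can be illustrated with plain sets of row identifiers. This is only a sketch of the set logic under assumed inputs. `converge`, the mode strings, and the row IDs are hypothetical, and it does not show how DataJoint actually builds or applies restrictions.

```python
def converge(paths, mode):
    """Combine the child-row sets reached through each restricted ancestor path."""
    sets = [set(p) for p in paths]
    if not sets:
        return set()
    if mode == "cascade":
        # OR: a row is affected if any path reaches it.
        return set().union(*sets)
    if mode == "restrict":
        # AND: a row is kept only if every path reaches it.
        return set.intersection(*sets)
    raise ValueError(f"unknown mode: {mode!r}")


# Hypothetical child rows reached via two different restricted ancestors.
via_subject = {1, 2, 3}
via_date = {2, 3, 4}

print(converge([via_subject, via_date], "cascade"))   # {1, 2, 3, 4}
print(converge([via_subject, via_date], "restrict"))  # {2, 3}
```

The text above adds one exception this sketch does not model: multiple foreign keys from one child to the same parent always combine with OR, whichever mode is in use.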
@@ -475,7 +440,7 @@ combined = dj.Diagram.from_sequence([schema1, schema2, schema3])

 ## Dependencies

-Operational methods (`cascade`, `restrict`, `delete`, `drop`, `preview`, `prune`) use `networkx`, which is always installed as a core dependency.
+Operational methods (`cascade`, `restrict`, `preview`, `prune`) use `networkx`, which is always installed as a core dependency.

 Diagram **visualization** requires optional dependencies:

@@ -490,7 +455,7 @@ If visualization dependencies are missing, `dj.Diagram` displays a warning and p
 ## See Also

 - [How to Read Diagrams](../../how-to/read-diagrams.ipynb)
-- [Delete Data](../../how-to/delete-data.md) — Diagram-level delete workflow
+- [Delete Data](../../how-to/delete-data.md) — Cascade inspection and delete workflow
 - [What's New in 2.2](../../explanation/whats-new-22.md) — Motivation and design
 - [Data Manipulation](data-manipulation.md) — Insert, update, delete specification
 - [Query Algebra](query-algebra.md)
