Commit a055edf

Donglai Wei and claude committed

Add seung-lab reference docs (186 repos), NEURD reference, update waterz README

- Document all 186 seung-lab GitHub repos in .claude/reference/seung-lab/ organized into 10 categories: core-libs, segmentation-pipeline, io-formats, visualization, registration, deep-learning, infrastructure, julia, datasets-papers, misc
- Add seung-lab.md index with categorized summary table
- Add NEURD.md: mesh decomposition framework for automated proofreading (reimerlab/NEURD, Nature 2025)
- Update lib/waterz/README.md with full API documentation: agglomerate, waterz, get_region_graph, merge_segments, merge_dust, merge_id, evaluate

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

1 parent 5a0b06f · commit a055edf
189 files changed
Lines changed: 2886 additions & 0 deletions

.claude/reference/NEURD.md

Lines changed: 139 additions & 0 deletions

# NEURD Reference

**GitHub:** https://github.com/reimerlab/NEURD
**Paper:** [Nature 2025](https://www.nature.com/articles/s41586-025-08660-5)
**Stars:** 22 | **Language:** Python (Jupyter Notebook heavy)
**Docs:** https://reimerlab.github.io/NEURD/

A mesh decomposition framework for **automated proofreading** and **morphological analysis** of neuronal EM reconstructions. Decomposes neuron meshes into branches/limbs, detects errors via graph-based filters, and produces corrected skeletons with compartment labels.

## Core Idea

Takes a segmented neuron mesh → decomposes into a hierarchical graph (soma → limbs → branches) → applies graph filters to detect merge/split errors → outputs a proofread skeleton with axon/dendrite/soma labels.

Unlike voxel-based approaches (our waterz pipeline), NEURD operates on **meshes** — it processes the output of a segmentation pipeline, not raw affinities. It's a **downstream consumer** of segmentations like ours.

## Architecture

```
Neuron Mesh (from segmentation)
  → Soma extraction (mesh clustering)
  → Limb decomposition (connected components after soma removal)
  → Branch decomposition (skeleton-guided mesh splitting via CGAL)
  → Concept network (hierarchical graph: Neuron > Limb > Branch)
  → Graph filters (error detection on branch graph)
  → Proofreading (split/merge suggestions)
  → Compartment labeling (axon/dendrite/soma/AIS)
  → Morphological features (width, spine density, boutons, etc.)
```

## Key Modules

| Module | Purpose |
|--------|---------|
| `neuron.py` | Core `Neuron` class — hierarchical data structure (soma/limb/branch) |
| `neuron_pipeline_utils.py` | Full pipeline: mesh → decomposition → proofreading → classification |
| `preprocess_neuron.py` | Mesh decomposition into limbs and branches |
| `error_detection.py` | Low-level error detection (double-back, width jumps, skeleton angles) |
| `graph_error_detector.py` | Graph filter framework for structured error detection |
| `graph_filters.py` | Specific filter implementations (upstream pair matching, degree checks) |
| `proofreading_utils.py` | Split/merge suggestion generation and application |
| `axon_utils.py` | Axon identification and tracing |
| `apical_utils.py` | Apical dendrite detection |
| `spine_utils.py` | Dendritic spine detection on mesh branches |
| `synapse_utils.py` | Synapse association with branches |
| `cell_type_utils.py` | Excitatory/inhibitory classification |
| `gnn_cell_typing_utils.py` | GNN-based cell type classification |
| `connectome_utils.py` | Connectivity matrix construction and analysis |
| `proximity_utils.py` | Inter-neuron proximity detection |
| `width_utils.py` | Branch width measurement from mesh |
| `soma_extraction_utils.py` | Soma mesh extraction |
| `soma_splitting_utils.py` | Multi-soma neuron splitting |
| `vdi_default.py` | Volume Data Interface — abstract data access layer |
| `vdi_microns.py` | MICrONS dataset interface |
| `vdi_h01.py` | H01 (human cortex) dataset interface |

## Error Detection Approach

NEURD detects errors via **graph filters** on the branch decomposition graph:

### Merge Error Detection
- **High-degree branching**: Nodes with degree > 3 in the skeleton graph → check if branches are compatible (skeleton angle, width continuity, synapse density)
- **Width jump detection**: Sudden width changes along a path suggest a merge of different neurites
- **Double-back detection**: A branch that reverses direction suggests it belongs to a different neuron
- **Axon-on-dendrite**: Thin axonal branch attached to thick dendrite → likely false merge

### Split Error Detection
- Not the primary focus of NEURD — it assumes the segmentation is over-merged rather than over-split
- Relies on upstream proofreading tools (e.g., PyChunkedGraph) for split correction

### Filter Pipeline
```python
# Pseudocode from graph_error_detector.py
for each high-degree node in skeleton graph:
    1. Check distance from soma (skip if too close)
    2. Filter short endpoints (< min_skeletal_length)
    3. For each upstream-downstream pair:
        a. Compare skeleton angles (alignment)
        b. Compare widths (continuity)
        c. Compare synapse densities
        d. Score match quality
    4. If best match score < threshold → flag as merge error
    5. Generate split suggestion (which branches to detach)
```
## Data Structure: Neuron Object

```
Neuron
├── soma (mesh, center, radius)
├── limbs[] (one per connected component after soma removal)
│   ├── branches[] (skeleton-guided mesh segments)
│   │   ├── skeleton (3D coordinates)
│   │   ├── mesh (trimesh object)
│   │   ├── width_array (per-skeleton-node width)
│   │   ├── synapses[] (associated synapses)
│   │   ├── spines[] (detected spines)
│   │   └── labels (axon/dendrite/etc.)
│   └── concept_network (networkx graph of branch connectivity)
└── neuron_graph (full skeleton graph)
```

## Dependencies

**Core:** numpy, scipy, networkx, trimesh, meshparty, pykdtree, pymeshfix, scikit-learn, matplotlib
**Data access:** datajoint, datasci-stdlib-tools, neuron_morphology_tools
**ML (optional):** torch, torch_geometric (for GNN cell typing)
**Mesh processing:** CGAL (via Docker — C++ mesh segmentation/skeletonization)

## Relevance to PyTC Decoding

NEURD operates **downstream** of our segmentation pipeline — it takes a segmented neuron mesh and proofreads it. Key connections:

### What NEURD does that we don't
1. **Mesh-based error detection**: Uses 3D mesh geometry (width, angles, surface area) rather than voxel-based affinity
2. **Structured decomposition**: Soma → limb → branch hierarchy enables local reasoning about errors
3. **Morphology-aware proofreading**: Width continuity, synapse density, and skeleton angle are strong signals for merge detection
4. **Multi-soma splitting**: Can detect and split neurons that were falsely merged at the soma level

### Ideas to port to PyTC (voxel-level)
1. **Width-based merge detection**: NEURD's width-jump filter could be adapted for voxel-based segments — compute skeleton width (via EDT) and flag segments with discontinuous width profiles
2. **Skeleton angle matching**: For our Stage 3 (skeleton split/re-merge), NEURD's upstream-downstream angle comparison is a proven criterion
3. **Graph filter framework**: The `graph_error_detector.py` pattern (parameterized filters applied to a neuron graph) could structure our branch_merge stages
4. **Multi-soma detection**: Before waterz agglomeration, detect soma regions and prevent cross-soma merges (similar to zwatershed's `somaBFS`)
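Idea 1 can be sketched concretely. This is a hypothetical illustration (the `width_jumps` helper is invented, not an existing PyTC or NEURD function), assuming a width profile already sampled along a skeleton, e.g. by reading an EDT of the segment mask at each skeleton node:

```python
import numpy as np

def width_jumps(widths, ratio=2.0, smooth=3):
    """Return indices where a skeleton width profile jumps abruptly.

    widths: per-skeleton-node radius estimates, ordered along the path
            (e.g. EDT values sampled at skeleton coordinates).
    ratio:  flag a jump when adjacent smoothed widths differ by > ratio.
    smooth: moving-average window to suppress per-voxel EDT noise.
    """
    # Smooth the profile with a short moving average.
    w = np.convolve(widths, np.ones(smooth) / smooth, mode="valid")
    big = np.maximum(w[1:], w[:-1])
    small = np.minimum(w[1:], w[:-1])
    # Indices where consecutive smoothed widths differ by more than `ratio`.
    return np.flatnonzero(big / np.maximum(small, 1e-9) > ratio)
```

A profile that steps from radius 1 to radius 5 gets flagged at the transition; a constant profile returns no indices.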
### What we do that NEURD doesn't
1. **Voxel-level affinity-based segmentation**: NEURD assumes meshes are already available
2. **Training + inference pipeline**: NEURD is pure post-processing
3. **Affinity-based merge evidence**: We use raw model predictions; NEURD uses geometry only

## Tutorials

| Tutorial | Description |
|----------|-------------|
| Auto Proofreading Pipeline | Full pipeline: mesh → decomposition → proofread |
| Neuron Features | Hierarchical data access, feature extraction |
| Proximities | Inter-neuron contact analysis |
| GNN Cell Typing | Graph neural network cell classification |
| VDI Override | Custom dataset integration |
| Spine Detection | Dendritic spine detection on mesh branches |

.claude/reference/seung-lab.md

Lines changed: 105 additions & 0 deletions

# Seung Lab Repository Index

**GitHub:** https://github.com/seung-lab (186 repos)

Princeton Seung Lab — tools for connectomics: large-scale EM reconstruction, visualization, and analysis. Reference docs for each repo are in `.claude/reference/seung-lab/`.

## Directory Structure

```
seung-lab/
├── core-libs/             (11) Image processing libraries (PyPI packages)
├── segmentation-pipeline/ (17) Watershed, agglomeration, meshing, skeletonization
├── io-formats/            (21) I/O, compression, cloud storage
├── visualization/         (8)  Neuroglancer, viewers, rendering
├── registration/          (13) Stitching, alignment, elastic registration
├── deep-learning/         (20) CNN architectures, training, inference
├── infrastructure/        (22) Task queues, pipelines, deployment
├── julia/                 (10) Julia-language packages
├── datasets-papers/       (17) Paper code, connectome datasets
└── misc/                  (48) Forks, utilities, archived
```

## Core Libraries (used by PyTC)

| Package | Repo | Stars | PyTC Usage |
|---------|------|-------|------------|
| `cc3d` | connected-components-3d | 450 | Connected component labeling in decoding |
| `edt` | euclidean-distance-transform-3d | 261 | Distance transforms for SDT targets |
| `kimimaro` | kimimaro | 193 | TEASAR skeletonization for skeleton-aware EDT |
| `fastremap` | fastremap | 63 | Fast label remapping in branch_merge |
| `crackle-codec` | crackle | 15 | Transitive dep via kimimaro |
| `fill-voids` | fill_voids | 29 | Hole filling in morphological ops |
| `xs3d` | cross-section | 6 | Cross-sectional area computation |
| `dijkstra3d` | dijkstra3d | 84 | Shortest path (used by kimimaro) |
## Segmentation Pipeline

| Repo | Stars | Description |
|------|-------|-------------|
| **abiss** | 6 | Affinity-based instance segmentation (C++ CLI) |
| **zmesh** | 72 | Marching cubes + mesh simplification |
| **watershed** | 6 | C++ watershed on affinity graphs |
| **segascorus** | 6 | Rand/VOI segmentation error metrics |
| **pcg_skel** | 0 | ChunkedGraph skeletonization |
| **Synaptor** | 0 | Synapse detection pipeline |
| **MMAAPP** | 2 | Mean affinity agglomeration |

## I/O & Cloud

| Repo | Stars | Description |
|------|-------|-------------|
| **cloud-volume** | 170 | Read/write Neuroglancer Precomputed volumes |
| **cloud-files** | 44 | Threaded GCS/S3/local file client |
| **fpzip** | 36 | Floating-point compression |
| **compresso** | 4 | Segmentation compression (600-2200x) |
| **DracoPy** | 117 | Google Draco mesh compression |
| **tinybrain** | 11 | Image pyramid generation |
| **mapbuffer** | 10 | Fast serialized int-to-bytes dict |

## Visualization

| Repo | Stars | Description |
|------|-------|-------------|
| **neuroglancer** | 24 | WebGL volumetric data viewer (Seung fork) |
| **microviewer** | 16 | Browser-based 3D numpy viewer |
| **NeuroBlender** | 8 | Blender neuron visualization |

## Registration & Alignment

| Repo | Stars | Description |
|------|-------|-------------|
| **SEAMLeSS** | 9 | ML-based EM section alignment |
| **corgie** | 16 | Petascale volume registration CLI |
| **metroem** | 9 | EM alignment model training |
| **feabas** | 0 | Finite-element EM stitching |
| **Alembic** | 10 | Julia elastic registration |

## Deep Learning

| Repo | Stars | Description |
|------|-------|-------------|
| **znn-release** | 94 | Multi-core 3D ConvNet (historical, archived) |
| **NCCNet** | 40 | Normalized cross-correlation template matching |
| **DeepEM** | 16 | Deep learning for EM connectomics |
| **chunkflow** | 55 | Distributed petabyte-scale processing |
| **torchfields** | 51 | PyTorch displacement field / spatial transformers |

## Infrastructure

| Repo | Stars | Description |
|------|-------|-------------|
| **igneous** | 66 | Scalable downsampling, meshing, skeletonizing |
| **python-task-queue** | 39 | SQS/filesystem async task queue |
| **seuron** | 7 | Distributed neuron reconstruction pipeline |
| **CAVEpipelines** | 5 | ChunkedGraph/meshing/L2cache deployment |

## Datasets & Papers

| Repo | Stars | Description |
|------|-------|-------------|
| **FlyConnectome** | 17 | FlyWire connectome data access |
| **FlyWirePaper** | 3 | FlyWire paper figure reproduction |
| **MicronsBinder** | 3 | MICrONS dataset notebooks |
| **zebrafish** | 1 | Zebrafish hindbrain connectome |
| **e2198-gc-analysis** | 8 | Retinal ganglion cell connectomics |
Lines changed: 30 additions & 0 deletions

# connected-components-3d

**GitHub:** https://github.com/seung-lab/connected-components-3d
**Language:** C++ | **Stars:** 450

Fast connected components labeling on multilabel 2D and 3D images. Supports 4/8-connected (2D) and 6/18/26-connected (3D) neighborhoods, continuous-value CCL, and periodic boundaries. Uses Union-Find with decision trees.

## Key Features
- Single-pass multilabel CCL (no per-label masking needed)
- Continuous-value CCL for grayscale images (delta-based grouping)
- Statistics: centroids, bounding boxes, voxel counts
- Dust removal (small/large object filtering), k-largest extraction
- Contact surface area and contact network computation
- Per-voxel connectivity graph extraction
- Periodic boundary support for simulations

## API
```python
import cc3d
import numpy as np

labels_in = np.ones((512, 512, 512), dtype=np.int32)
labels_out = cc3d.connected_components(labels_in, connectivity=26)
labels_out = cc3d.dust(labels_out, threshold=100, connectivity=26)
labels_out = cc3d.largest_k(labels_out, k=10)
stats = cc3d.statistics(labels_out)  # centroids, bboxes, voxel_counts
```

## Relevance to Connectomics
Core dependency of PyTC (`cc3d` in requirements). Used in segmentation post-processing to split disconnected components, remove dust, and compute instance statistics after watershed/agglomeration.
Lines changed: 28 additions & 0 deletions

# crackle

**GitHub:** https://github.com/seung-lab/crackle
**Language:** C++ | **Stars:** 15

Next-generation 3D segmentation compression codec based on crack codes. Provides high compression ratios for dense label volumes with fast random access and label queries without full decompression.

## Key Features
- Compress/decompress 2D and 3D dense segmentation arrays
- Extract binary images, labels, voxel counts, centroids, bounding boxes without decompressing
- Array slicing via CrackleArray with read/write support
- Connected components, contact surface analysis, voxel connectivity graph
- Remap, refit, and renumber labels in compressed form
- CLI tool for file conversion and integrity checking

## API
```python
import crackle

binary = crackle.compress(labels, allow_pins=False, markov_model_order=0)
labels = crackle.decompress(binary)
uniq = crackle.labels(binary)
arr = crackle.CrackleArray(binary)
res = arr[:10, :10, :10]
crackle.save(labels, "output.ckl")
```

## Relevance to Connectomics
Primary compression format for large-scale EM segmentation volumes; used as a dependency by kimimaro, fastmorph, and cloud-volume. Listed as a core dependency of PyTorch Connectomics.
Lines changed: 25 additions & 0 deletions

# cross-section

**GitHub:** https://github.com/seung-lab/cross-section
**Language:** C++ | **Stars:** 6

Compute cross-sectional area and arbitrary 2D slice projections of 3D volumetric image objects. Published as the `xs3d` PyPI package.

## Key Features
- Cross-sectional area measurement at any point/orientation in a 3D binary image
- Arbitrary-angle 2D slicing of 3D volumes
- Anisotropy-aware with physical unit support
- Edge contact detection for underestimate warnings
- Per-voxel area contribution maps

## API
```python
import xs3d

area = xs3d.cross_sectional_area(binary_image, vertex, normal, resolution)
area, contact = xs3d.cross_sectional_area(binary_image, vertex, normal, resolution, return_contact=True)
image2d = xs3d.slice(labels, vertex, normal, anisotropy)
section_map = xs3d.cross_section(binary_image, vertex, normal, resolution)
```

## Relevance to Connectomics
Measures neurite caliber (cross-sectional area) along skeletons for compartment simulations and morphological analysis.
Lines changed: 30 additions & 0 deletions

# dijkstra3d

**GitHub:** https://github.com/seung-lab/dijkstra3d
**Language:** C++ | **Stars:** 84

Dijkstra's shortest path variants for 6-, 18-, and 26-connected 3D image volumes (and 4/8-connected 2D). Designed for voxel-based pathfinding without explicit graph construction.

## Key Features
- Dijkstra, bidirectional Dijkstra, and A* (compass) search on 3D volumes
- Binary Dijkstra for foreground/background images
- Euclidean distance field computation with anisotropy support
- Parental field for efficient multi-target path extraction
- Voxel connectivity graph support for custom traversal constraints
- No explicit graph construction needed (implicit edges from image grid)

## API
```python
import dijkstra3d
import numpy as np

field = np.ones((512, 512, 512), dtype=np.int32)
path = dijkstra3d.dijkstra(field, source=(0,0,0), target=(511,511,511), connectivity=26)
path = dijkstra3d.binary_dijkstra(field, (0,0,0), (511,511,511), background_color=0)
dist = dijkstra3d.euclidean_distance_field(field, source=(0,0,0), anisotropy=(4,4,40))
parents = dijkstra3d.parental_field(field, source=(0,0,0))
path = dijkstra3d.path_from_parents(parents, target=(511,511,511))
```

## Relevance to Connectomics
Core dependency of kimimaro (TEASAR skeletonization); provides shortest-path computation through 3D segmentation volumes for skeleton extraction and distance-based analysis.
Lines changed: 30 additions & 0 deletions

# euclidean-distance-transform-3d

**GitHub:** https://github.com/seung-lab/euclidean-distance-transform-3d
**Language:** C++ | **Stars:** 261

Multi-label anisotropic 3D Euclidean distance transform (MLAEDT-3D) using marching parabolas. Computes EDT and signed distance fields for 1D/2D/3D labeled images, with support for anisotropic voxel spacing and parallel execution.

## Key Features
- Single-pass multi-label distance transform (no per-label masking needed)
- Anisotropic voxel spacing support (critical for EM data)
- Signed distance function (SDF) computation
- Parallel multi-threaded execution
- Per-label iteration via `edt.each()`
- Voxel connectivity graph for self-touching labels

## API
```python
import edt
import numpy as np

labels = np.ones((512, 512, 512), dtype=np.uint32, order='F')
dt = edt.edt(labels, anisotropy=(6, 6, 30), black_border=True, parallel=4)
sdf = edt.sdf(labels, anisotropy=(6, 6, 30))

for label, image in edt.each(labels, dt, in_place=True):
    process(image)
```

## Relevance to Connectomics
Computes distance transforms for EM segmentation post-processing, used in TEASAR skeletonization (kimimaro) and boundary-based loss functions. PyTC uses distance transforms in `connectomics/data/process/distance.py`.
