Commit a055edf

Donglai Wei and claude committed

Add seung-lab reference docs (186 repos), NEURD reference, update waterz README

- Document all 186 seung-lab GitHub repos in .claude/reference/seung-lab/ organized into 10 categories: core-libs, segmentation-pipeline, io-formats, visualization, registration, deep-learning, infrastructure, julia, datasets-papers, misc
- Add seung-lab.md index with categorized summary table
- Add NEURD.md: mesh decomposition framework for automated proofreading (reimerlab/NEURD, Nature 2025)
- Update lib/waterz/README.md with full API documentation: agglomerate, waterz, get_region_graph, merge_segments, merge_dust, merge_id, evaluate

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

1 parent 5a0b06f · commit a055edf
189 files changed
Lines changed: 2886 additions & 0 deletions

.claude/reference/NEURD.md

Lines changed: 139 additions & 0 deletions

# NEURD Reference

**GitHub:** https://github.com/reimerlab/NEURD
**Paper:** [Nature 2025](https://www.nature.com/articles/s41586-025-08660-5)
**Stars:** 22 | **Language:** Python (Jupyter Notebook heavy)
**Docs:** https://reimerlab.github.io/NEURD/

A mesh decomposition framework for **automated proofreading** and **morphological analysis** of neuronal EM reconstructions. Decomposes neuron meshes into branches/limbs, detects errors via graph-based filters, and produces corrected skeletons with compartment labels.

## Core Idea

Takes a segmented neuron mesh → decomposes into a hierarchical graph (soma → limbs → branches) → applies graph filters to detect merge/split errors → outputs a proofread skeleton with axon/dendrite/soma labels.

Unlike voxel-based approaches (our waterz pipeline), NEURD operates on **meshes** — it processes the output of a segmentation pipeline, not raw affinities. It's a **downstream consumer** of segmentations like ours.

## Architecture

```
Neuron Mesh (from segmentation)
  → Soma extraction (mesh clustering)
  → Limb decomposition (connected components after soma removal)
  → Branch decomposition (skeleton-guided mesh splitting via CGAL)
  → Concept network (hierarchical graph: Neuron > Limb > Branch)
  → Graph filters (error detection on branch graph)
  → Proofreading (split/merge suggestions)
  → Compartment labeling (axon/dendrite/soma/AIS)
  → Morphological features (width, spine density, boutons, etc.)
```

## Key Modules

| Module | Purpose |
|--------|---------|
| `neuron.py` | Core `Neuron` class — hierarchical data structure (soma/limb/branch) |
| `neuron_pipeline_utils.py` | Full pipeline: mesh → decomposition → proofreading → classification |
| `preprocess_neuron.py` | Mesh decomposition into limbs and branches |
| `error_detection.py` | Low-level error detection (double-back, width jumps, skeleton angles) |
| `graph_error_detector.py` | Graph filter framework for structured error detection |
| `graph_filters.py` | Specific filter implementations (upstream pair matching, degree checks) |
| `proofreading_utils.py` | Split/merge suggestion generation and application |
| `axon_utils.py` | Axon identification and tracing |
| `apical_utils.py` | Apical dendrite detection |
| `spine_utils.py` | Dendritic spine detection on mesh branches |
| `synapse_utils.py` | Synapse association with branches |
| `cell_type_utils.py` | Excitatory/inhibitory classification |
| `gnn_cell_typing_utils.py` | GNN-based cell type classification |
| `connectome_utils.py` | Connectivity matrix construction and analysis |
| `proximity_utils.py` | Inter-neuron proximity detection |
| `width_utils.py` | Branch width measurement from mesh |
| `soma_extraction_utils.py` | Soma mesh extraction |
| `soma_splitting_utils.py` | Multi-soma neuron splitting |
| `vdi_default.py` | Volume Data Interface — abstract data access layer |
| `vdi_microns.py` | MICrONS dataset interface |
| `vdi_h01.py` | H01 (human cortex) dataset interface |

## Error Detection Approach

NEURD detects errors via **graph filters** on the branch decomposition graph:

### Merge Error Detection
- **High-degree branching**: Nodes with degree > 3 in the skeleton graph → check if branches are compatible (skeleton angle, width continuity, synapse density)
- **Width jump detection**: Sudden width changes along a path suggest a merge of different neurites
- **Double-back detection**: A branch that reverses direction suggests it belongs to a different neuron
- **Axon-on-dendrite**: Thin axonal branch attached to thick dendrite → likely false merge

### Split Error Detection
- Not the primary focus of NEURD — it assumes the segmentation is over-merged rather than over-split
- Relies on upstream proofreading tools (e.g., PyChunkedGraph) for split correction

### Filter Pipeline
```python
# Pseudocode from graph_error_detector.py
for each high-degree node in skeleton graph:
    1. Check distance from soma (skip if too close)
    2. Filter short endpoints (< min_skeletal_length)
    3. For each upstream-downstream pair:
        a. Compare skeleton angles (alignment)
        b. Compare widths (continuity)
        c. Compare synapse densities
        d. Score match quality
    4. If best match score < threshold → flag as merge error
    5. Generate split suggestion (which branches to detach)
```
## Data Structure: Neuron Object

```
Neuron
├── soma (mesh, center, radius)
├── limbs[] (one per connected component after soma removal)
│   ├── branches[] (skeleton-guided mesh segments)
│   │   ├── skeleton (3D coordinates)
│   │   ├── mesh (trimesh object)
│   │   ├── width_array (per-skeleton-node width)
│   │   ├── synapses[] (associated synapses)
│   │   ├── spines[] (detected spines)
│   │   └── labels (axon/dendrite/etc.)
│   └── concept_network (networkx graph of branch connectivity)
└── neuron_graph (full skeleton graph)
```

## Dependencies

**Core:** numpy, scipy, networkx, trimesh, meshparty, pykdtree, pymeshfix, scikit-learn, matplotlib
**Data access:** datajoint, datasci-stdlib-tools, neuron_morphology_tools
**ML (optional):** torch, torch_geometric (for GNN cell typing)
**Mesh processing:** CGAL (via Docker — C++ mesh segmentation/skeletonization)

## Relevance to PyTC Decoding

NEURD operates **downstream** of our segmentation pipeline — it takes a segmented neuron mesh and proofreads it. Key connections:

### What NEURD does that we don't
1. **Mesh-based error detection**: Uses 3D mesh geometry (width, angles, surface area) rather than voxel-based affinity
2. **Structured decomposition**: Soma → limb → branch hierarchy enables local reasoning about errors
3. **Morphology-aware proofreading**: Width continuity, synapse density, and skeleton angle are strong signals for merge detection
4. **Multi-soma splitting**: Can detect and split neurons that were falsely merged at the soma level

### Ideas to port to PyTC (voxel-level)
1. **Width-based merge detection**: NEURD's width-jump filter could be adapted for voxel-based segments — compute skeleton width (via EDT) and flag segments with discontinuous width profiles
2. **Skeleton angle matching**: For our Stage 3 (skeleton split/re-merge), NEURD's upstream-downstream angle comparison is a proven criterion
3. **Graph filter framework**: The `graph_error_detector.py` pattern (parameterized filters applied to a neuron graph) could structure our branch_merge stages
4. **Multi-soma detection**: Before waterz agglomeration, detect soma regions and prevent cross-soma merges (similar to zwatershed's `somaBFS`)
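Idea 1 can be sketched concretely. This is a hypothetical illustration (the `width_jumps` helper is invented, not an existing PyTC or NEURD function), assuming a width profile already sampled along a skeleton, e.g. by reading an EDT of the segment mask at each skeleton node:

```python
import numpy as np

def width_jumps(widths, ratio=2.0, smooth=3):
    """Return indices where a skeleton width profile jumps abruptly.

    widths: per-skeleton-node radius estimates, ordered along the path
            (e.g. EDT values sampled at skeleton coordinates).
    ratio:  flag a jump when adjacent smoothed widths differ by > ratio.
    smooth: moving-average window to suppress per-voxel EDT noise.
    """
    # Smooth the profile with a short moving average.
    w = np.convolve(widths, np.ones(smooth) / smooth, mode="valid")
    big = np.maximum(w[1:], w[:-1])
    small = np.minimum(w[1:], w[:-1])
    # Indices where consecutive smoothed widths differ by more than `ratio`.
    return np.flatnonzero(big / np.maximum(small, 1e-9) > ratio)
```

A profile that steps from radius 1 to radius 5 gets flagged at the transition; a constant profile returns no indices.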
### What we do that NEURD doesn't
1. **Voxel-level affinity-based segmentation**: NEURD assumes meshes are already available
2. **Training + inference pipeline**: NEURD is pure post-processing
3. **Affinity-based merge evidence**: We use raw model predictions; NEURD uses geometry only

## Tutorials

| Tutorial | Description |
|----------|-------------|
| Auto Proofreading Pipeline | Full pipeline: mesh → decomposition → proofread |
| Neuron Features | Hierarchical data access, feature extraction |
| Proximities | Inter-neuron contact analysis |
| GNN Cell Typing | Graph neural network cell classification |
| VDI Override | Custom dataset integration |
| Spine Detection | Dendritic spine detection on mesh branches |

.claude/reference/seung-lab.md

Lines changed: 105 additions & 0 deletions

# Seung Lab Repository Index

**GitHub:** https://github.com/seung-lab (186 repos)

Princeton Seung Lab — tools for connectomics: large-scale EM reconstruction, visualization, and analysis. Reference docs for each repo are in `.claude/reference/seung-lab/`.

## Directory Structure

```
seung-lab/
├── core-libs/             (11) Image processing libraries (PyPI packages)
├── segmentation-pipeline/ (17) Watershed, agglomeration, meshing, skeletonization
├── io-formats/            (21) I/O, compression, cloud storage
├── visualization/         (8)  Neuroglancer, viewers, rendering
├── registration/          (13) Stitching, alignment, elastic registration
├── deep-learning/         (20) CNN architectures, training, inference
├── infrastructure/        (22) Task queues, pipelines, deployment
├── julia/                 (10) Julia-language packages
├── datasets-papers/       (17) Paper code, connectome datasets
└── misc/                  (48) Forks, utilities, archived
```

## Core Libraries (used by PyTC)

| Package | Repo | Stars | PyTC Usage |
|---------|------|-------|------------|
| `cc3d` | connected-components-3d | 450 | Connected component labeling in decoding |
| `edt` | euclidean-distance-transform-3d | 261 | Distance transforms for SDT targets |
| `kimimaro` | kimimaro | 193 | TEASAR skeletonization for skeleton-aware EDT |
| `fastremap` | fastremap | 63 | Fast label remapping in branch_merge |
| `crackle-codec` | crackle | 15 | Transitive dep via kimimaro |
| `fill-voids` | fill_voids | 29 | Hole filling in morphological ops |
| `xs3d` | cross-section | 6 | Cross-sectional area computation |
| `dijkstra3d` | dijkstra3d | 84 | Shortest path (used by kimimaro) |
## Segmentation Pipeline

| Repo | Stars | Description |
|------|-------|-------------|
| **abiss** | 6 | Affinity-based instance segmentation (C++ CLI) |
| **zmesh** | 72 | Marching cubes + mesh simplification |
| **watershed** | 6 | C++ watershed on affinity graphs |
| **segascorus** | 6 | Rand/VOI segmentation error metrics |
| **pcg_skel** | 0 | ChunkedGraph skeletonization |
| **Synaptor** | 0 | Synapse detection pipeline |
| **MMAAPP** | 2 | Mean affinity agglomeration |

## I/O & Cloud

| Repo | Stars | Description |
|------|-------|-------------|
| **cloud-volume** | 170 | Read/write Neuroglancer Precomputed volumes |
| **cloud-files** | 44 | Threaded GCS/S3/local file client |
| **fpzip** | 36 | Floating-point compression |
| **compresso** | 4 | Segmentation compression (600-2200x) |
| **DracoPy** | 117 | Google Draco mesh compression |
| **tinybrain** | 11 | Image pyramid generation |
| **mapbuffer** | 10 | Fast serialized int-to-bytes dict |

## Visualization

| Repo | Stars | Description |
|------|-------|-------------|
| **neuroglancer** | 24 | WebGL volumetric data viewer (Seung fork) |
| **microviewer** | 16 | Browser-based 3D numpy viewer |
| **NeuroBlender** | 8 | Blender neuron visualization |

## Registration & Alignment

| Repo | Stars | Description |
|------|-------|-------------|
| **SEAMLeSS** | 9 | ML-based EM section alignment |
| **corgie** | 16 | Petascale volume registration CLI |
| **metroem** | 9 | EM alignment model training |
| **feabas** | 0 | Finite-element EM stitching |
| **Alembic** | 10 | Julia elastic registration |

## Deep Learning

| Repo | Stars | Description |
|------|-------|-------------|
| **znn-release** | 94 | Multi-core 3D ConvNet (historical, archived) |
| **NCCNet** | 40 | Normalized cross-correlation template matching |
| **DeepEM** | 16 | Deep learning for EM connectomics |
| **chunkflow** | 55 | Distributed petabyte-scale processing |
| **torchfields** | 51 | PyTorch displacement field / spatial transformers |

## Infrastructure

| Repo | Stars | Description |
|------|-------|-------------|
| **igneous** | 66 | Scalable downsampling, meshing, skeletonizing |
| **python-task-queue** | 39 | SQS/filesystem async task queue |
| **seuron** | 7 | Distributed neuron reconstruction pipeline |
| **CAVEpipelines** | 5 | ChunkedGraph/meshing/L2cache deployment |

## Datasets & Papers

| Repo | Stars | Description |
|------|-------|-------------|
| **FlyConnectome** | 17 | FlyWire connectome data access |
| **FlyWirePaper** | 3 | FlyWire paper figure reproduction |
| **MicronsBinder** | 3 | MICrONS dataset notebooks |
| **zebrafish** | 1 | Zebrafish hindbrain connectome |
| **e2198-gc-analysis** | 8 | Retinal ganglion cell connectomics |
Lines changed: 30 additions & 0 deletions

# connected-components-3d

**GitHub:** https://github.com/seung-lab/connected-components-3d
**Language:** C++ | **Stars:** 450

Fast connected components labeling on multilabel 2D and 3D images. Supports 4/8-connected (2D) and 6/18/26-connected (3D) neighborhoods, continuous-value CCL, and periodic boundaries. Uses Union-Find with decision trees.

## Key Features
- Single-pass multilabel CCL (no per-label masking needed)
- Continuous-value CCL for grayscale images (delta-based grouping)
- Statistics: centroids, bounding boxes, voxel counts
- Dust removal (small/large object filtering), k-largest extraction
- Contact surface area and contact network computation
- Per-voxel connectivity graph extraction
- Periodic boundary support for simulations

## API
```python
import cc3d
import numpy as np

labels_in = np.ones((512, 512, 512), dtype=np.int32)
labels_out = cc3d.connected_components(labels_in, connectivity=26)
labels_out = cc3d.dust(labels_out, threshold=100, connectivity=26)
labels_out = cc3d.largest_k(labels_out, k=10)
stats = cc3d.statistics(labels_out)  # centroids, bboxes, voxel_counts
```

## Relevance to Connectomics
Core dependency of PyTC (`cc3d` in requirements). Used in segmentation post-processing to split disconnected components, remove dust, and compute instance statistics after watershed/agglomeration.
Lines changed: 28 additions & 0 deletions

# crackle

**GitHub:** https://github.com/seung-lab/crackle
**Language:** C++ | **Stars:** 15

Next-generation 3D segmentation compression codec based on crack codes. Provides high compression ratios for dense label volumes with fast random access and label queries without full decompression.

## Key Features
- Compress/decompress 2D and 3D dense segmentation arrays
- Extract binary images, labels, voxel counts, centroids, bounding boxes without decompressing
- Array slicing via CrackleArray with read/write support
- Connected components, contact surface analysis, voxel connectivity graph
- Remap, refit, and renumber labels in compressed form
- CLI tool for file conversion and integrity checking

## API
```python
import crackle

binary = crackle.compress(labels, allow_pins=False, markov_model_order=0)
labels = crackle.decompress(binary)
uniq = crackle.labels(binary)
arr = crackle.CrackleArray(binary)
res = arr[:10, :10, :10]
crackle.save(labels, "output.ckl")
```

## Relevance to Connectomics
Primary compression format for large-scale EM segmentation volumes; used as a dependency by kimimaro, fastmorph, and cloud-volume. Listed as a core dependency of PyTorch Connectomics.
Lines changed: 25 additions & 0 deletions

# cross-section

**GitHub:** https://github.com/seung-lab/cross-section
**Language:** C++ | **Stars:** 6

Compute cross-sectional area and arbitrary 2D slice projections of 3D volumetric image objects. Published as the `xs3d` PyPI package.

## Key Features
- Cross-sectional area measurement at any point/orientation in a 3D binary image
- Arbitrary-angle 2D slicing of 3D volumes
- Anisotropy-aware with physical unit support
- Edge contact detection for underestimate warnings
- Per-voxel area contribution maps

## API
```python
import xs3d

area = xs3d.cross_sectional_area(binary_image, vertex, normal, resolution)
area, contact = xs3d.cross_sectional_area(binary_image, vertex, normal, resolution, return_contact=True)
image2d = xs3d.slice(labels, vertex, normal, anisotropy)
section_map = xs3d.cross_section(binary_image, vertex, normal, resolution)
```

## Relevance to Connectomics
Measures neurite caliber (cross-sectional area) along skeletons for compartment simulations and morphological analysis.
Lines changed: 30 additions & 0 deletions

# dijkstra3d

**GitHub:** https://github.com/seung-lab/dijkstra3d
**Language:** C++ | **Stars:** 84

Dijkstra's shortest path variants for 6-, 18-, and 26-connected 3D image volumes (and 4/8-connected 2D). Designed for voxel-based pathfinding without explicit graph construction.

## Key Features
- Dijkstra, bidirectional Dijkstra, and A* (compass) search on 3D volumes
- Binary Dijkstra for foreground/background images
- Euclidean distance field computation with anisotropy support
- Parental field for efficient multi-target path extraction
- Voxel connectivity graph support for custom traversal constraints
- No explicit graph construction needed (implicit edges from image grid)

## API
```python
import dijkstra3d
import numpy as np

field = np.ones((512, 512, 512), dtype=np.int32)
path = dijkstra3d.dijkstra(field, source=(0,0,0), target=(511,511,511), connectivity=26)
path = dijkstra3d.binary_dijkstra(field, (0,0,0), (511,511,511), background_color=0)
dist = dijkstra3d.euclidean_distance_field(field, source=(0,0,0), anisotropy=(4,4,40))
parents = dijkstra3d.parental_field(field, source=(0,0,0))
path = dijkstra3d.path_from_parents(parents, target=(511,511,511))
```

## Relevance to Connectomics
Core dependency of kimimaro (TEASAR skeletonization); provides shortest-path computation through 3D segmentation volumes for skeleton extraction and distance-based analysis.
Lines changed: 30 additions & 0 deletions

# euclidean-distance-transform-3d

**GitHub:** https://github.com/seung-lab/euclidean-distance-transform-3d
**Language:** C++ | **Stars:** 261

Multi-label anisotropic 3D Euclidean distance transform (MLAEDT-3D) using marching parabolas. Computes EDT and signed distance fields for 1D/2D/3D labeled images, with support for anisotropic voxel spacing and parallel execution.

## Key Features
- Single-pass multi-label distance transform (no per-label masking needed)
- Anisotropic voxel spacing support (critical for EM data)
- Signed distance function (SDF) computation
- Parallel multi-threaded execution
- Per-label iteration via `edt.each()`
- Voxel connectivity graph for self-touching labels

## API
```python
import edt
import numpy as np

labels = np.ones((512, 512, 512), dtype=np.uint32, order='F')
dt = edt.edt(labels, anisotropy=(6, 6, 30), black_border=True, parallel=4)
sdf = edt.sdf(labels, anisotropy=(6, 6, 30))

for label, image in edt.each(labels, dt, in_place=True):
    process(image)
```

## Relevance to Connectomics
Computes distance transforms for EM segmentation post-processing, used in TEASAR skeletonization (kimimaro) and boundary-based loss functions. PyTC uses distance transforms in `connectomics/data/process/distance.py`.
