feat: seg contact training, inference, and data pipeline #1216
Open
mgschm wants to merge 65 commits into
Import the pointcloud architecture module in architecture/__init__.py so its network registrations are loaded whenever zetta_utils modules are loaded. The import is wrapped in try/except so that environments without the pointnet package installed still import cleanly.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
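A minimal sketch of the guarded-import pattern this commit describes. The helper name `try_import` is hypothetical and not taken from the PR; the actual change may simply wrap a plain `import` statement.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def try_import(module_name: str) -> bool:
    """Import a module for its side effects; report whether it was available."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        # Optional dependency missing: skip registration instead of failing.
        logger.debug("Skipping optional module %s: %s", module_name, exc)
        return False
    return True
```

Catching only `ImportError` (which also covers `ModuleNotFoundError`) keeps unrelated errors raised inside the optional module visible.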
- Add resample_points() with distance-weighted sampling (uniform, inverse_r, inverse_r2)
- Add resample_pointclouds() for per-segment and contact-face resampling
- Add resample_combined_pointcloud() for combined pointcloud resampling
- Add deduplicate_pointclouds() to remove duplicate points
- Add apply_random_flip() for random axis flipping augmentation
- Add randomize_segment_identity() to swap seg_a/seg_b labels
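The three weighting modes can be illustrated with a small standard-library sketch. The function name, signature, the choice of sampling with replacement, and the clamp on small distances are all assumptions for illustration; the PR's resample_points() may differ on each point.

```python
import math
import random


def resample_points_sketch(points, center, n, mode="uniform", seed=None):
    """Draw n points with replacement, weighted by distance r to center."""
    rng = random.Random(seed)
    weights = []
    for p in points:
        r = max(math.dist(p, center), 1e-6)  # clamp to avoid division by zero
        if mode == "uniform":
            weights.append(1.0)
        elif mode == "inverse_r":
            weights.append(1.0 / r)
        elif mode == "inverse_r2":
            weights.append(1.0 / (r * r))
        else:
            raise ValueError(f"unknown mode: {mode}")
    return rng.choices(points, weights=weights, k=n)
```

With `inverse_r2`, a point ten times farther from the center is drawn roughly a hundred times less often, which concentrates samples near the contact.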
- Add contact_faces_original_nm field to preserve pre-normalization coordinates
- Change segment labels from 0/1 to -1/+1 encoding
- Add optional 5th affinity channel (per_point or mean mode)
- Add channel masking support with global/local/random modes
- Add affinity noise augmentation
- Include info_path in output for visualization support
- Add format_version attribute to SegContactLayerBackend (default 1.1)
- Load format_version from info file (defaults to 1.0 for old data)
- Conditionally read representative_points only for format >= 1.1
- Make representative_points optional (None for format 1.0)
- Add explicit error checks for missing representative_points:
  - Reading format >= 1.1: fail if data missing
  - Writing: fail if representative_points is None
- Handle None representative_points in randomize_segment_identity
- Use packaging.version.Version for proper version comparison
- Update test to expect format_version 1.1 for new backends
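The version gate can be sketched as below. To stay dependency-free, this uses a plain integer-tuple parser as a stand-in for packaging.version.Version, which the commit says the real code uses; the helper names and the `info` dict shape are assumptions.

```python
def parse_version(text):
    """Parse 'major.minor' style strings into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def should_read_representative_points(info):
    # Old info files carry no format_version; treat them as 1.0.
    version = parse_version(str(info.get("format_version", "1.0")))
    return version >= (1, 1)
```

Comparing parsed versions rather than raw strings matters because string comparison orders "1.10" before "1.9".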
Previously the write path always required representative_points, even when writing format 1.0 data. Now properly checks format_version before writing representative_points section.
- Add format_version field (default 1.1 for new layers)
- Inherit format_version from source in from_reference method
- Use params.format_version in make_info instead of hardcoded value
…version
Extract pointcloud-to-tensor conversion into shared module to ensure consistent segment labeling between training and inference. Refactor SegContactDataset to use new utilities.
New operation to run PointNet contact merge inference and write merge_probabilities to seg_contact layer.
Script to compute aggregate AUC-PR/AUC-ROC metrics across all chunks by reading written merge_probabilities and comparing to ground truth.
…ntact data
- Make contact_faces optional on SegContact (default None)
- Skip _write_contacts_chunk when all contact_faces are None
- Simplify ContactMergeOp to build minimal SegContact with only id, seg_a, seg_b, com, and merge_probabilities
…ination
- Add required --source-path flag for contacts and GT data
- Use os.path.join for trailing-slash safety in GCS paths
- Switch mean affinity to pytorch nonzero-mask computation
- Filter contacts by COM within bbox (matching backend behavior)
- Generate chunk keys from info file grid instead of --chunk-size
- Generate chunk keys from info file grid instead of --chunk-size
- Add early validation that prediction files exist at expected path
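A short sketch of why joining path components is safer than string concatenation for GCS-style paths. It uses posixpath, which is what os.path resolves to on POSIX systems, so the result does not depend on the host OS; the bucket path and helper name are made up for illustration.

```python
import posixpath


def prediction_path(base, chunk_key):
    """Join a base path and a chunk key with exactly one separator."""
    return posixpath.join(base, chunk_key)


with_slash = prediction_path("gs://example-bucket/preds/", "chunk_0")
without_slash = prediction_path("gs://example-bucket/preds", "chunk_0")
# Naive formatting doubles the separator when the base ends with "/".
naive = "gs://example-bucket/preds/" + "/" + "chunk_0"
```

One caveat of this approach: if the second component starts with "/", the join discards the base entirely, so chunk keys should be relative.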
…laimed flag), max_unclaimed_vx/fraction, min_interface_gt_fraction filters
…, and chunk exclusion
- Interactive histogram dashboard for seg contact filter stats with threshold sliders, GT-stacked bars, and neuroglancer popup links
- Annotation system: per-contact correct/wrong/unclear labels with notes, localStorage persistence, import/export, annotation-colored bar mode with 4 categories (correct/wrong/unclear/unannotated)
- Per-GT category sample tables with deterministic ordering (seeded PRNG), pagination, and weighted annotation summaries
- GT segment refs (gt_refs_a/b) written to parquet and selected in neuroglancer Ground Truth layer links
- Chunk exclusion filter: text input with range support + ctrl+click toggle
- Tests for segment metrics and contact filter stats helpers
…lows
Allows processing a random subset of chunks (seeded for reproducibility) instead of all chunks, useful for sampling-based test runs or stats collection.
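Seeded subset selection might look roughly like this. The parameter name `max_random_chunks` appears in the PR summary, but the function name, default seed, and the sort-before-sample step are assumptions.

```python
import random


def select_chunks(chunk_keys, max_random_chunks=None, seed=0):
    """Return all chunks, or a reproducible random subset of them."""
    if max_random_chunks is None or max_random_chunks >= len(chunk_keys):
        return list(chunk_keys)
    # Sort first so the result depends only on the seed, not on input order.
    rng = random.Random(seed)
    return rng.sample(sorted(chunk_keys), max_random_chunks)
```

Using a private `random.Random` instance avoids disturbing the global random state that other parts of a pipeline may rely on.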
Add optional nucleus_layer param to SegContactOp. In filter stats mode, it reads the nucleus segmentation at the finest available resolution and records a has_nucleus flag per chunk in the parquet output. Production mode is unaffected; nucleus filtering is deferred to the training pipeline.
Store nucleus segmentation path in dataset info for neuroglancer links.
Dashboard changes: nucleus checkbox in chunk panel to exclude chunks with nuclei, chunk bar dimming respects unfiltered mode, click-to-highlight on chunk bars, and nucleus segmentation added as hidden layer in ngl links.
…s, mesh LRU cache, np.isin dtype fix
- Make reference_layer optional in SegContactOp (no GT needed for production)
- Add constraint_layers dict for filtering contacts by dominant segment labels
- Add MeshLRUCache with memory budget using cachetools.LRUCache
- Sort pairs by segment frequency to maximize cache hits
- Parallelize mesh downloads with ThreadPoolExecutor (num_procs)
- Add mesh_lod parameter for downloading lower-resolution meshes
- Add early affinity filter before expensive boundary/COM filters
- Vectorize pair-level filters using pandas reindex
- Fix np.isin dtype bug that falsely blacklisted ~15% of segments with large uint64 IDs
- Add model.eval() for ckpt+json model loading (fixes batchnorm in eval mode)
- Add csv_output_path to ContactMergeOp for per-chunk score export
- Add skip-if-exists check in SegContactOp
- Add NaN-to-zero clamping for model output
- Handle MeshMissingError in mesh downloads
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
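The idea of an LRU cache bounded by memory rather than entry count can be sketched with the standard library. This is a stand-in, not the PR's MeshLRUCache: the commit says the real class builds on cachetools.LRUCache, and the class name, explicit `size` argument, and method names below are hypothetical.

```python
from collections import OrderedDict


class ByteBudgetLRU:
    """Evict least-recently-used entries once the total size exceeds a budget."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.total_bytes = 0
        self._items = OrderedDict()  # key -> (value, size)

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key][0]

    def put(self, key, value, size):
        if size > self.max_bytes:
            return  # too large to ever fit; do not cache
        if key in self._items:
            self.total_bytes -= self._items.pop(key)[1]
        self._items[key] = (value, size)
        self.total_bytes += size
        while self.total_bytes > self.max_bytes:
            _, (_, evicted_size) = self._items.popitem(last=False)
            self.total_bytes -= evicted_size
```

Processing pairs in an order that groups frequently occurring segments together, as the commit describes, raises the chance that a mesh is still cached when it is needed again.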
…ing, remove zero-padding assumptions
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…p_chunks_with_nucleus option, MeshMissingError compat
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…_memory, precomputed mean_affinity, ValCheckIntervalGuard
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…thority filtering, PR curve axis controls
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…odel
Aligns with the criteria/loss_weights refactor in ContactMergeRegime: each criterion key (here "merge") maps to a batch key. Targets are now [B,1] to match the PointNet model output shape, avoiding ambiguous broadcasting in the loss computation. Bumps internal submodule.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add dominant_ref_only mode to _compute_seg_to_ref_by_segment that keeps only the highest-overlap reference per segment, avoiding false-positive merges from sliver overlaps at misaligned supervoxel boundaries.
SegContactOp now writes both gt_merge_label_set and gt_merge_label_dominant columns (plus gt_dominant_ref_a/b) so downstream analysis can compare criteria without re-running.
Visualization gains a set-vs-dominant toggle and NG links switch between full ref sets and dominant refs accordingly.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
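The difference between the two criteria can be sketched as follows. The overlap table shape, the helper names, the tie-break rule, and the reading of the "set" criterion as "the two reference sets intersect" are assumptions for illustration, not the PR's implementation.

```python
def dominant_refs(overlaps):
    """overlaps: {segment_id: {ref_id: voxel_count}} -> {segment_id: ref_id}."""
    result = {}
    for seg_id, counts in overlaps.items():
        if counts:
            # Break ties on the smaller ref id so results are deterministic.
            result[seg_id] = min(counts, key=lambda ref: (-counts[ref], ref))
    return result


def merge_labels(seg_a, seg_b, overlaps):
    """Return (set-based label, dominant-based label) for one contact."""
    refs_a = set(overlaps.get(seg_a, {}))
    refs_b = set(overlaps.get(seg_b, {}))
    set_label = bool(refs_a & refs_b)
    dom = dominant_refs(overlaps)
    dominant_label = seg_a in dom and dom.get(seg_a) == dom.get(seg_b)
    return set_label, dominant_label


# Segment 1 touches ref 200 only through a 12-voxel sliver.
example = {1: {100: 5000, 200: 12}, 2: {200: 4000}}
labels = merge_labels(1, 2, example)
```

In this example the set-based criterion calls the pair a merge because both segments overlap ref 200, while the dominant criterion does not, since segment 1 is dominated by ref 100.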
Read-proc that zeros out segmentation voxels whose IDs are not in a neuroglancer_segment_properties allowlist, enabling on-the-fly proofreading masks. Properties file is fetched once per worker via lru_cache. Includes dtype-safe ID comparison with overflow detection.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previously affinity noise was added to all points, including segment points (which have 0 in the affinity channel by construction), injecting spurious non-zero values. Noise is now masked to contact-face (CF) points only, identified via the segment label channel (label != ±1).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
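A plain-Python sketch of the masking logic. The real code presumably operates on tensors; the row layout `[x, y, z, label, affinity]`, the label value 0 for contact-face points, and the Gaussian noise model are assumptions made for illustration.

```python
import random


def add_affinity_noise(points, sigma, seed=None):
    """points: list of [x, y, z, label, affinity] rows; returns a new list."""
    rng = random.Random(seed)
    noisy = []
    for x, y, z, label, affinity in points:
        if label not in (-1, 1):  # contact-face point, not a segment point
            affinity = affinity + rng.gauss(0.0, sigma)
        noisy.append([x, y, z, label, affinity])
    return noisy
```

Segment points keep their exact zero in the affinity channel, so the model can still rely on that channel to separate segment points from contact-face points.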
Runs zetta run for all (dataset, variant, agg_level) combinations with configurable parallelism. Supports per-dataset variant/agg lists, temp cue files to avoid edit races, and per-job log files.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add tests for tensor_utils, procs (resample, dedup, flip, identity swap, normalize with use_pointcloud_radius), and dataset helpers (pad/truncate, channel masking, affinity filtering). Fix 9 pre-existing broken tests in test_seg_contact_dataset caused by target->merge rename and empty-return logic that were never updated.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add comprehensive tests for RebatchingDataLoader helpers (_squeeze_chunks, _cat_chunks, _buf_len, _buf_index, _buf_slice, _tensor_rebatch, _prefetch, _pin_batches) and end-to-end integration. Add dataset tests for contact_faces_original_nm, affinity noise, channel masking, mean affinity mode, contact_label=None, and missing config key skipping.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Regenerate requirements.modules.txt to include pointnet (needed for Docker image builds). Update internal submodule with configurable optimizer for ContactMergeRegime.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… subprocesses
The builder context manager in building.py sets CURRENT_BUILD_SPEC to the full JSON-serialized spec, which reaches several MB for large training configs. Leaving it in os.environ caused execve-based subprocesses (wandb-core, DDP workers, spawn DataLoader workers) to inherit the bloated environment and crash with "Argument list too long". The variable is now popped from the environment before it is consumed; downstream code uses ZETTA_RUN_SPEC_PATH or trainer.log_config.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
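The pop-before-consume pattern can be sketched like this. The helper name and the optional `env` argument (added so the sketch can run against a plain dict) are hypothetical; only the variable name CURRENT_BUILD_SPEC comes from the commit message.

```python
import json
import os


def consume_build_spec(env=None, key="CURRENT_BUILD_SPEC"):
    """Remove the serialized spec from the environment and return it parsed."""
    env = os.environ if env is None else env
    raw = env.pop(key, None)  # child processes no longer inherit the value
    return None if raw is None else json.loads(raw)
```

Popping from `os.environ` also unsets the variable for the current process, so anything spawned afterwards starts with a small environment block.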
…and dict-return from mesh.exists
- gt_dominant_ref_{a,b}: build via pd.array(..., dtype="UInt64") instead of
Series.map(); the map() path coerced None-bearing columns to float64,
silently rounding IDs above 2^53. Old parquets need regeneration.
- Skip mesh.exists call when info["mesh"] is None (saves a GCS RTT) and
set has_mesh_* = False.
- mesh_cv.mesh.exists() returns dict for some source classes; iterate
dict.items() to avoid the bare-zip bug that marked every seg as
has-mesh on dict returns.
- Add asserts in SegContactOp / AddPointcloudsOp on layers without a mesh
field, with actionable guidance (run igneous meshing or use
collect_filter_stats_only=True).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
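The bare-zip bug from the third bullet can be reproduced with a small sketch. The return shapes are assumptions: a list aligned with the queried IDs in one case, and a dict keyed by segment ID with None for missing meshes in the other. The helper name is hypothetical.

```python
def has_mesh_flags(segment_ids, exists_result):
    """Map each segment id to True/False for either return shape."""
    if isinstance(exists_result, dict):
        # Assumption: keys are segment ids, values are None when missing.
        found = {int(k): bool(v) for k, v in exists_result.items()}
        return {seg: found.get(seg, False) for seg in segment_ids}
    return {seg: bool(v) for seg, v in zip(segment_ids, exists_result)}


ids = [7, 8]
dict_result = {7: "7:0", 8: None}
# Zipping against a dict iterates its keys, which are all truthy here.
buggy = {seg: bool(v) for seg, v in zip(ids, dict_result)}
fixed = has_mesh_flags(ids, dict_result)
```

The buggy version marks segment 8 as having a mesh even though its entry is None.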
…ex backslashes in chunk-exclude JS
- Detect parquets written before the UInt64 fix (gt_dominant_ref_{a,b}
with float dtype) and warn that GT IDs > 2^53 are rounded and NGL
links point at nonexistent segment IDs.
- Escape \\d and \\s in the parseChunkExcludeInput regex literals (the
Python f-string was eating the single backslash).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
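The rounding that motivates the UInt64 fix is easy to demonstrate without pandas: float64 has a 53-bit significand, so not every integer above 2^53 is representable. The helper name below is hypothetical.

```python
def survives_float64(segment_id):
    """True if the ID round-trips through a float64 without changing."""
    return int(float(segment_id)) == segment_id


exact = 2 ** 53
rounded = int(float(2 ** 53 + 1))  # → 9007199254740992, i.e. 2**53
```

Any column of segment IDs that passes through a float dtype can therefore point at a different, possibly nonexistent, segment, which is why parquets written before the fix need regeneration.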
Pulls in MEC seg-contacts spec + train/inference spec updates.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
- … ValCheckIntervalGuard, _get_worker_init_fn with CUDA disable
- … max_random_chunks option
- … load_and_run_model
- … seg_contact_dataset.py and data_loader.py)
Submodule PRs
Test plan
🤖 Generated with Claude Code