Enhance segmentation with DINO backend and kinematic tracking by jameslehoux · Pull Request #14 · BASE-Laboratory/BraggTrack

jameslehoux · 2026-05-18T07:12:49Z

No description provided.

Add merge_nearby_labels() post-processing step that greedily merges adjacent watershed fragments whose intensity-weighted centroids are within a configurable distance. This addresses over-segmentation where a single Bragg spot gets split into multiple watershed basins.

Parameter changes (data-driven from sweep on bundled scans):
- min_seed_separation: 1 → 2 (halves over-splitting, spread=1)
- CLI adds --threshold-fraction and --merge-distance (default 15)
- Notebook segment() uses merge_nearby_labels with distance=15

Before: 35/36/33 spots across 3 scans (spread=3, over-segmented)
After: 18/17/16 spots across 3 scans (spread=2, better consistency)

https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk
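The greedy merge described in this commit can be illustrated roughly as follows. This is a minimal sketch, not the code from the PR: the name `merge_nearby_labels_sketch`, its signature, the face-adjacency test, and the "closest pair first" ordering are all assumptions made for illustration. Only the idea of merging adjacent fragments whose intensity-weighted centroids lie within a distance comes from the commit message.

```python
import numpy as np


def weighted_centroids(labels, intensity):
    """Map each nonzero label to its intensity-weighted centroid."""
    out = {}
    for lab in np.unique(labels):
        if lab == 0:
            continue
        mask = labels == lab
        coords = np.argwhere(mask).astype(float)
        weights = intensity[mask].astype(float)
        if weights.sum() <= 0:  # fall back to a plain geometric centroid
            weights = np.ones_like(weights)
        out[int(lab)] = (coords * weights[:, None]).sum(axis=0) / weights.sum()
    return out


def adjacent_pairs(labels):
    """Return the set of (smaller, larger) label pairs that share a face."""
    pairs = set()
    for axis in range(labels.ndim):
        moved = np.moveaxis(labels, axis, 0)
        a, b = moved[:-1].ravel(), moved[1:].ravel()
        touching = (a != b) & (a > 0) & (b > 0)
        for p, q in zip(a[touching].tolist(), b[touching].tolist()):
            pairs.add((min(p, q), max(p, q)))
    return pairs


def merge_nearby_labels_sketch(labels, intensity, distance=15.0):
    """Hypothetical greedy merge: repeatedly fuse the closest adjacent pair."""
    labels = labels.copy()
    while True:
        cents = weighted_centroids(labels, intensity)
        candidates = [
            (float(np.linalg.norm(cents[p] - cents[q])), p, q)
            for p, q in adjacent_pairs(labels)
        ]
        candidates = [c for c in candidates if c[0] <= distance]
        if not candidates:
            return labels
        _, keep, drop = min(candidates)
        labels[labels == drop] = keep  # the smaller label id survives
```

Each pass removes one label, so the loop terminates. Recomputing centroids after every merge keeps the distance test honest as fragments grow, at the cost of extra work on large volumes.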

DINOv3 patch-level features → PCA → HDBSCAN clustering → 3D slice stitching via union-find, replacing hand-tuned intensity-domain parameters with learned feature-space representations that generalise across beamlines and detectors.

- Add PatchFeatureEncoder protocol + MockPatchEncoder + TorchDinoPatchEncoder
- Add segment_dino() pipeline with slice extraction, clustering, upsampling, 3D stitching, and Otsu foreground masking
- Wire --method classical|dino flag and DINO-specific CLI args
- Add scikit-learn>=1.3 to core dependencies (for HDBSCAN)
- Add 16 tests covering mock encoder, clustering, stitching, and end-to-end

https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk
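For readers unfamiliar with the last stage, here is a rough sketch of stitching per-slice labels into 3D regions with union-find. It covers only the stitching step; the encoder, PCA, and HDBSCAN stages are not shown. The function name `stitch_slices_sketch`, the `min_overlap` parameter, and the overlap criterion (shared pixels divided by the smaller region's size) are assumptions for illustration and may differ from the PR's `_stitch_slices_3d`.

```python
import numpy as np


class UnionFind:
    """Minimal union-find over hashable keys, with path halving."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[max(ra, rb)] = min(ra, rb)


def stitch_slices_sketch(slices, min_overlap=0.5):
    """Link 2D labels in neighbouring slices that overlap enough (0 = background)."""
    uf = UnionFind()
    sizes = []  # per-slice {label: pixel count}, computed once up front
    for z, sl in enumerate(slices):
        ids, counts = np.unique(sl[sl > 0], return_counts=True)
        sizes.append(dict(zip(ids.tolist(), counts.tolist())))
        for i in ids.tolist():
            uf.find((z, i))  # register every region as its own set

    for z in range(len(slices) - 1):
        a, b = slices[z], slices[z + 1]
        both = (a > 0) & (b > 0)
        if not both.any():
            continue
        pairs, counts = np.unique(
            np.stack([a[both], b[both]], axis=1), axis=0, return_counts=True
        )
        for (la, lb), n in zip(pairs.tolist(), counts.tolist()):
            if n / min(sizes[z][la], sizes[z + 1][lb]) >= min_overlap:
                uf.union((z, la), (z + 1, lb))

    # Assign one global id per connected set of slice-level regions.
    global_ids = {}
    volume = np.zeros((len(slices),) + slices[0].shape, dtype=np.int32)
    for z, sl in enumerate(slices):
        for i in sizes[z]:
            root = uf.find((z, i))
            gid = global_ids.setdefault(root, len(global_ids) + 1)
            volume[z][sl == i] = gid
    return volume
```

Keying the union-find on `(slice_index, local_label)` tuples avoids having to renumber the per-slice labels before stitching.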

Runs `--method dino --dino-backend mock` on bundled sample data and validates artifacts (labels, features, summary) for all 3 scans. Also fixes HDBSCAN on tiny volumes where few patches exist: single-patch slices are treated as one region instead of returning empty labels.

- Add scripts/check_dino_acceptance.py (mirrors check_week2_acceptance)
- Add tests/test_dino_acceptance.py
- Wire DINO acceptance gate into scripts/ci_report.py
- Fix _cluster_feature_map to handle small volumes gracefully

https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk

Side-by-side comparison on bundled sample data: spot counts, tri-axis label projections, feature distributions, Dice overlap, centroid scatter, and cross-scan consistency metrics. Uses mock backend by default.

https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk
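As a reference for the Dice overlap mentioned above, a minimal sketch follows. It assumes the comparison is made on foreground masks (label > 0) of the two segmentations. The comparison's actual definition is not shown in this PR, and `dice_overlap` is a hypothetical helper name.

```python
import numpy as np


def dice_overlap(labels_a, labels_b):
    """Dice coefficient 2|A∩B| / (|A| + |B|) between two foreground masks."""
    a = np.asarray(labels_a) > 0
    b = np.asarray(labels_b) > 0
    total = int(a.sum()) + int(b.sum())
    if total == 0:
        return 1.0  # convention assumed here: two empty masks agree perfectly
    return 2.0 * int(np.logical_and(a, b).sum()) / total
```

Because label ids from the classical and DINO paths are not expected to correspond, comparing binary foregrounds avoids any label-matching step.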

- Type encoder param as PatchFeatureEncoder instead of object
- Type backend param as BackendName instead of str
- Vectorize _upsample_labels with np.repeat (was Python double-loop)
- Pre-compute label sizes in _stitch_slices_3d (avoids O(n_pairs*n_pixels))
- Eliminate double Otsu in CLI: compute threshold only in classical branch
- Use result.threshold in summary JSON (works for both methods)
- Make response field zero-size array (no LoG response in DINO path)
- Clarify slice_hw with if/elif/else instead of nested ternary
- Remove os.environ side effect from test file

https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk
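The `np.repeat` vectorization in the list above can be illustrated with a small comparison. Both function names and the square `patch_size` argument are assumptions for this sketch; the PR's `_upsample_labels` may take different parameters or crop to the original image size.

```python
import numpy as np


def upsample_labels_loop(patch_labels, patch_size):
    """Reference version: fill each patch_size x patch_size block in Python."""
    h, w = patch_labels.shape
    out = np.zeros((h * patch_size, w * patch_size), dtype=patch_labels.dtype)
    for i in range(h):
        for j in range(w):
            out[
                i * patch_size : (i + 1) * patch_size,
                j * patch_size : (j + 1) * patch_size,
            ] = patch_labels[i, j]
    return out


def upsample_labels_repeat(patch_labels, patch_size):
    """Vectorized version: repeat rows, then columns."""
    rows = np.repeat(patch_labels, patch_size, axis=0)
    return np.repeat(rows, patch_size, axis=1)
```

Both produce the same nearest-neighbour upsampling; the second pushes the per-patch loop into NumPy.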

Compute physically meaningful evolution quantities for each tracked grain: strain (Δd/d₀), misorientation (angular drift in μ and χ), growth/dissolution (relative intensity and volume changes), and shape evolution (anisotropy, covariance trace). Includes summary statistics and flat-table export.

https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk
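The quantities named above can be sketched as plain NumPy helpers. Everything below is illustrative: the function names are hypothetical, `d` is assumed to be a per-scan spacing value with `d0` its reference, angles are assumed to be in degrees, and the anisotropy definition (1 − λ_min/λ_max of the covariance) is one plausible choice, not necessarily the one used in the PR.

```python
import numpy as np


def strain(d, d0):
    """Relative change Δd/d₀ with respect to the reference value d0."""
    return (np.asarray(d, dtype=float) - d0) / d0


def angular_drift_deg(angle, reference):
    """Signed drift in degrees, wrapped into [-180, 180)."""
    delta = np.asarray(angle, dtype=float) - reference
    return (delta + 180.0) % 360.0 - 180.0


def relative_change(value, reference):
    """Relative intensity or volume change with respect to a reference."""
    return (np.asarray(value, dtype=float) - reference) / reference


def shape_metrics(cov):
    """Return (covariance trace, anisotropy) for a symmetric covariance matrix."""
    eig = np.linalg.eigvalsh(np.asarray(cov, dtype=float))  # ascending order
    trace = float(eig.sum())
    anisotropy = float(1.0 - eig[0] / eig[-1]) if eig[-1] > 0 else 0.0
    return trace, anisotropy
```

For example, a grain whose spacing grows by 0.1% per scan from `d0 = 1.0` gives `strain([1.0, 1.001, 1.002], 1.0)` close to `[0, 0.001, 0.002]`, and a μ angle advancing 0.5° per scan gives a drift of `[0, 0.5, 1.0]` degrees.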

Deterministic 4-grain scenario with analytically known evolution:
- Grain A: linear elastic loading (strain = 0.1%/step)
- Grain B: pure rotation (0.5°/step μ, 0.3°/step χ, no strain)
- Grain C: dissolution (linear intensity/volume decay)
- Grain D: late nucleation (born scan 2, growing)

Verifies exact recovery of strain, misorientation, growth/dissolution, shape metrics, and summary statistics through the full pipeline.

https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk

1. Remove all "Week N" references from docstrings, artifact paths, schema versions, script names, and test names. Replace with descriptive names: segmentation, tracking, embedding, io.
2. Fix broken import in ablation script (was importing non-existent _load_feature_csv from track_dataset; now imports load_feature_csv from _utils).
3. Remove no-op assignment in otsu.py smooth_thresholds (line that assigned smoothed[outlier] = smoothed[outlier]).
4. Add tests/conftest.py with shared make_spot() fixture factory, deduplicate _spot() helpers across test files.
5. Standardize CLI arg naming: track_dataset now uses "root" instead of "indir" to match all other CLI modules.
6. Extract _write_notebook() to shared write_qc_notebook() in cli/_utils.py, eliminating duplication between segment_dataset and track_dataset.

Artifact paths: week2 → segmentation, week3 → tracking, week4 → embedding
Schema versions: week2.v1 → segmentation.v1, week3.v1 → tracking.v1, week4.v1 → embedding.v1/tracking_semantic.v1
Script renames: check_week2_* → check_segmentation_*, etc.
Test renames: test_week1_* → test_io_*, test_week2_* → test_segmentation_*, etc.

https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk

jameslehoux added 8 commits May 17, 2026 14:30

jameslehoux merged commit 1dc96d3 into main May 18, 2026
10 checks passed

jameslehoux merged 8 commits into main from claude/3dxrd-tracking-week2-Bt9ed
