fix: stabilize DINO segmentation with global PCA and foreground-aware… by jameslehoux · Pull Request #16 · BASE-Laboratory/BraggTrack

jameslehoux · 2026-05-18T08:27:24Z

… clustering

Three structural issues caused wildly inconsistent spot counts across scans:

- Per-slice PCA fitted different feature bases per slice, so HDBSCAN produced different cluster counts even for similar content
- Background patches participated in PCA and clustering, diluting signal
- min_cluster_size=3 allowed spurious micro-clusters

Fixes: fit a single PCA across all foreground patches from all slices, mask out pure-background patches before clustering, raise min_cluster_size default 3→5, lower min_overlap_fraction 0.3→0.2.
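A minimal sketch of the global-PCA idea, not the code from this PR: foreground patches from every slice are pooled, one basis is fitted, and each slice is projected onto that shared basis before clustering. The function names, the toy feature sizes, the random foreground masks, and `n_components=8` are all assumptions for illustration. PCA is done with a plain NumPy SVD here so the example has no extra dependencies.

```python
import numpy as np


def fit_global_pca(features_per_slice, foreground_per_slice, n_components=8):
    """Fit one PCA basis on foreground patches pooled from all slices."""
    pooled = np.concatenate(
        [f[m] for f, m in zip(features_per_slice, foreground_per_slice)], axis=0
    )
    mean = pooled.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(pooled - mean, full_matrices=False)
    return mean, vt[:n_components]


def project_foreground(features, foreground, mean, components):
    """Drop background patches, then project onto the shared basis."""
    return (features[foreground] - mean) @ components.T


rng = np.random.default_rng(0)
# Three toy slices, each with 100 patches of 32-dim features (made-up sizes).
features_per_slice = [rng.normal(size=(100, 32)) for _ in range(3)]
# Hypothetical foreground masks; a real mask would come from image intensity.
foreground_per_slice = [rng.random(100) > 0.5 for _ in range(3)]

mean, components = fit_global_pca(features_per_slice, foreground_per_slice)
reduced = [
    project_foreground(f, m, mean, components)
    for f, m in zip(features_per_slice, foreground_per_slice)
]
```

Because every slice shares `mean` and `components`, the reduced features live in the same coordinate system, so a density-based clusterer such as HDBSCAN (with `min_cluster_size=5`) sees comparable inputs from slice to slice.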

https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk


The model name facebook/dinov3-vitb16-pretrain-lvd1689m is a real HuggingFace checkpoint (released 2025-08-13). Revert the incorrect rename to dinov2-small.

Key fixes for the torch backend:

- Skip register tokens (4 in DINOv3) when extracting patch features, not just the CLS token
- Update default patch_size from 14 to 16 (DINOv3 uses 16x16 patches)
- Read num_register_tokens from model config for forward compat

Also update multiview encoder default model to DINOv3.

https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk
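The token-skipping fix can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: it assumes the encoder output is ordered as one CLS token, then the register tokens, then the patch tokens, and it uses a `SimpleNamespace` as a stand-in for the model config. The helper name `extract_patch_tokens` is hypothetical.

```python
from types import SimpleNamespace

import numpy as np


def extract_patch_tokens(hidden_states, config):
    """Return only patch tokens from a (batch, tokens, dim) array.

    Assumes the layout [CLS, registers..., patches...]. Models without
    register tokens fall back to skipping the CLS token only.
    """
    num_registers = getattr(config, "num_register_tokens", 0)
    num_prefix = 1 + num_registers  # CLS plus any register tokens
    return hidden_states[:, num_prefix:, :]


# Stand-in config: 4 register tokens and 16x16 patches, as described above.
config = SimpleNamespace(num_register_tokens=4, patch_size=16)

# A 224x224 input with 16x16 patches gives 14 * 14 = 196 patch tokens.
num_patches = (224 // config.patch_size) ** 2
hidden = np.zeros((2, 1 + 4 + num_patches, 768))

patches = extract_patch_tokens(hidden, config)
print(patches.shape)  # (2, 196, 768)
```

Reading the count with `getattr(..., 0)` keeps the same helper usable for checkpoints whose config has no `num_register_tokens` field.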

The comparison notebook now installs torch/transformers on Colab and uses backend="auto" so it runs real DINOv3 ViT-B/16 weights when a GPU is available, falling back to mock for local/CI environments. https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk
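One way the `backend="auto"` fallback could be resolved is sketched below. The function name `resolve_backend` and the exact selection rule are assumptions; only the backend names `"auto"`, `"torch"`, and `"mock"` come from the description above.

```python
def resolve_backend(backend="auto"):
    """Pick "torch" when torch, transformers, and a GPU are all available.

    Any explicit backend name is returned unchanged. In "auto" mode a
    missing dependency or missing GPU falls back to "mock".
    """
    if backend != "auto":
        return backend
    try:
        import torch
        import transformers  # noqa: F401  (only checking that it imports)
    except ImportError:
        return "mock"
    return "torch" if torch.cuda.is_available() else "mock"


print(resolve_backend("auto"))
```

On a CPU-only CI runner this would print `mock`; on a Colab GPU runtime with both packages installed it would print `torch`.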

jameslehoux added 3 commits May 18, 2026 08:25

jameslehoux merged commit 5052c47 into main May 18, 2026
8 checks passed

jameslehoux merged 3 commits into main from claude/3dxrd-tracking-week2-Bt9ed
