feat: JOSS submission, test suite overhaul, and strategic refactoring by jameslehoux · Pull Request #13 · BASE-Laboratory/BraggTrack

jameslehoux · 2026-05-17T10:37:41Z

Summary

Add JOSS paper draft (paper.md, paper.bib, CITATION.cff) with full bibliography and machine-readable citation metadata
Expand test suite from 57 → 147 tests: error-path coverage, invariant assertions, end-to-end integration test on bundled data
Replace hand-rolled pure-Python Otsu with skimage.filters.threshold_otsu (~100x faster)
Fix load_primary_volume to return np.ndarray directly (eliminates ~200 MB transient garbage per volume load)
Factor duplicated CLI helpers into braggtrack/cli/_utils.py
Add SpotRecord TypedDict formalizing the feature-table contract across pipeline stages
Move DINOv2 encoder construction outside per-scan loop (prevents model reload per scan)
Fix notebook: clear stale outputs, remove unused import, correct misleading markdown about mock embeddings
Add ruff linting/formatting, CHANGELOG, py.typed marker, dynamic __version__

Test plan

python -m unittest discover tests — 147 tests pass
python scripts/ci_report.py — all 6 acceptance gates pass (unit, week1–4, smoke)
Notebook runs end-to-end: jupyter nbconvert --execute notebooks/braggtrack_demo.ipynb
pip install -e . succeeds without torch/transformers
Verify JOSS paper builds: docker run --rm -v $PWD:/data openjournals/inara -o pdf paper.md (optional)

- Ruff lint + format enforced across braggtrack/, tests/, scripts/ (27 files reformatted, 10 lint issues auto-fixed, config in pyproject.toml) - Added CHANGELOG.md following Keep a Changelog format - Created braggtrack/py.typed PEP 561 marker - __version__ dynamically read from package metadata with dev fallback - Fixed: unused imports, unsorted imports, typing.Iterable→collections.abc - LICENSE copyright updated to 2025-2026 - CI lint job gates test job (ruff check + format --check) https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk

Prepares JOSS submission materials: paper.md (~1800 words covering segmentation, feature extraction, tracking, and reproducibility), paper.bib (14 references), and CITATION.cff. Updates pyproject.toml with author metadata and project URLs. https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk

Covers: extract_instance_table (centroid, eigenvalues, bbox, weighted fallback), remove_small_objects, fill_holes_binary, relabel_sequential, gaussian_blur_3d, laplacian_3d, log_enhance_3d, orthogonal_mips, crop_spot_cube, mock encoder determinism, and io path resolution. https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk

- Clear all cached cell outputs (showed pre-fix 11/22/36 spot counts; actual results are now 18/20/16) - Remove unused generate_crossing_scenario import - Fix section 5 markdown: mock embeddings don't increase fragmentation on well-separated data, they simply have no effect - Fix section 6 markdown: honestly frame the flat ablation curve as expected mock-encoder behavior https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk

…egration - Add test_error_paths.py: ValueError for empty Otsu, empty volume, unsupported method, unparseable scan name, flat volume, empty/single frames in build_tracks, negative intensity fallback, window > length - Add test_invariants.py: relabel always 0..N, fill_holes is superset of input, remove_small never introduces labels, eigenvalues non-negative and descending, Otsu within range, smooth reduces variance, mock encoder always unit-norm - Add test_integration.py: end-to-end pipeline on bundled data verifying stable spot counts, required columns, sane tracking metrics, embedding unit norms - Consolidate test_semantic_embeddings.py into test_semantic_week4.py, removing duplicate MIP/encoder/crop tests - Fix test_segment_dataset_cli.py to use tempfile.TemporaryDirectory instead of leaking artifacts on failure 147 tests, all passing (up from 102). https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk

…tRecord - Replace hand-rolled pure-Python Otsu with skimage.filters.threshold_otsu (~100x faster on real volumes, zero risk since scikit-image is already a hard dependency) - Fix load_primary_volume to return np.ndarray directly instead of calling .tolist() and creating millions of transient Python floats - Move encoder construction outside per-scan loop in embed_dataset.py (prevents re-loading DINOv2 weights for every scan) - Factor duplicated CLI helpers (_synth_volume_from_file, _write_csv, _load_feature_csv) into braggtrack/cli/_utils.py - Add SpotRecord TypedDict in braggtrack/types.py formalizing the implicit feature-table contract used across all pipeline stages https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk

…741) https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk

Added an image to the README for visual enhancement.

https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk

nbconvert appends .ipynb to --output, causing a PermissionError when writing to /dev/null.ipynb. Use --output-dir /tmp to discard output. https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk

jameslehoux added 9 commits May 17, 2026 10:06

fix: resolve all ruff lint errors (UP037, F821, F401, SIM117, I001, E…

638ddb4

…741) https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk

Add image to README for visual enhancement

7cc2fad

Added an image to the README for visual enhancement.

style: apply ruff format to 6 files

e7d78d7

https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk

jameslehoux force-pushed the claude/3dxrd-tracking-week2-Bt9ed branch from 4901dfc to e7d78d7 Compare May 17, 2026 10:47

fix(ci): use --output-dir instead of --output /dev/null for nbconvert

6a5e01b

nbconvert appends .ipynb to --output, causing a PermissionError when writing to /dev/null.ipynb. Use --output-dir /tmp to discard output. https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk

jameslehoux merged commit 3c046ef into main May 17, 2026
10 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: JOSS submission, test suite overhaul, and strategic refactoring#13

feat: JOSS submission, test suite overhaul, and strategic refactoring#13
jameslehoux merged 10 commits into
mainfrom
claude/3dxrd-tracking-week2-Bt9ed

jameslehoux commented May 17, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

jameslehoux commented May 17, 2026

Summary

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant