feat: JOSS submission, test suite overhaul, and strategic refactoring #13
Merged
- Ruff lint + format enforced across braggtrack/, tests/, scripts/ (27 files reformatted, 10 lint issues auto-fixed, config in pyproject.toml)
- Added CHANGELOG.md following Keep a Changelog format
- Created braggtrack/py.typed PEP 561 marker
- __version__ dynamically read from package metadata with dev fallback
- Fixed: unused imports, unsorted imports, typing.Iterable → collections.abc
- LICENSE copyright updated to 2025-2026
- CI lint job gates test job (ruff check + format --check)

https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk
Prepares JOSS submission materials: paper.md (~1800 words covering segmentation, feature extraction, tracking, and reproducibility), paper.bib (14 references), and CITATION.cff. Updates pyproject.toml with author metadata and project URLs. https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk
Covers: extract_instance_table (centroid, eigenvalues, bbox, weighted fallback), remove_small_objects, fill_holes_binary, relabel_sequential, gaussian_blur_3d, laplacian_3d, log_enhance_3d, orthogonal_mips, crop_spot_cube, mock encoder determinism, and io path resolution. https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk
- Clear all cached cell outputs (showed pre-fix 11/22/36 spot counts; actual results are now 18/20/16)
- Remove unused generate_crossing_scenario import
- Fix section 5 markdown: mock embeddings don't increase fragmentation on well-separated data, they simply have no effect
- Fix section 6 markdown: honestly frame the flat ablation curve as expected mock-encoder behavior

https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk
…egration

- Add test_error_paths.py: ValueError for empty Otsu, empty volume, unsupported method, unparseable scan name, flat volume, empty/single frames in build_tracks, negative intensity fallback, window > length
- Add test_invariants.py: relabel always 0..N, fill_holes is superset of input, remove_small never introduces labels, eigenvalues non-negative and descending, Otsu within range, smooth reduces variance, mock encoder always unit-norm
- Add test_integration.py: end-to-end pipeline on bundled data verifying stable spot counts, required columns, sane tracking metrics, embedding unit norms
- Consolidate test_semantic_embeddings.py into test_semantic_week4.py, removing duplicate MIP/encoder/crop tests
- Fix test_segment_dataset_cli.py to use tempfile.TemporaryDirectory instead of leaking artifacts on failure

147 tests, all passing (up from 102).

https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk
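To make the "relabel always 0..N" invariant concrete, the sketch below shows the general shape such a test could take with `unittest`. The `relabel_sequential` defined here is a pure-Python stand-in written for this example and operates on a flat list of integers; the real braggtrack function and its signature are not shown in this PR and probably work on arrays.

```python
import random
import unittest


def relabel_sequential(labels):
    """Hypothetical stand-in: map nonzero labels to 1..N by first appearance.

    Label 0 is treated as background and left unchanged.
    """
    mapping = {}
    out = []
    for value in labels:
        if value == 0:
            out.append(0)
            continue
        if value not in mapping:
            mapping[value] = len(mapping) + 1
        out.append(mapping[value])
    return out


class RelabelInvariantTest(unittest.TestCase):
    def test_labels_are_contiguous(self):
        rng = random.Random(0)  # fixed seed keeps the test deterministic
        for _ in range(50):
            size = rng.randint(0, 30)
            labels = [rng.choice([0, 3, 7, 42, 100]) for _ in range(size)]
            relabeled = relabel_sequential(labels)
            n_objects = len(set(labels) - {0})
            # Invariant 1: nonzero output labels are exactly 1..N, no gaps.
            self.assertEqual(set(relabeled) - {0}, set(range(1, n_objects + 1)))
            # Invariant 2: background positions are preserved.
            self.assertEqual(
                [v == 0 for v in labels], [v == 0 for v in relabeled]
            )
```

Unlike an example-based test, this checks a property that must hold for any input, which is why randomized inputs with a fixed seed are a reasonable fit.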
…tRecord

- Replace hand-rolled pure-Python Otsu with skimage.filters.threshold_otsu (~100x faster on real volumes, zero risk since scikit-image is already a hard dependency)
- Fix load_primary_volume to return np.ndarray directly instead of calling .tolist() and creating millions of transient Python floats
- Move encoder construction outside per-scan loop in embed_dataset.py (prevents re-loading DINOv2 weights for every scan)
- Factor duplicated CLI helpers (_synth_volume_from_file, _write_csv, _load_feature_csv) into braggtrack/cli/_utils.py
- Add SpotRecord TypedDict in braggtrack/types.py formalizing the implicit feature-table contract used across all pipeline stages

https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk
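As an illustration of what formalizing a feature-table contract with a `TypedDict` can look like, here is a small sketch. Only the class name `SpotRecord` comes from the commit; every field name below and the `largest_spot` helper are hypothetical, since the PR text does not list the actual schema.

```python
from typing import TypedDict


class SpotRecord(TypedDict):
    # All field names here are illustrative guesses, not the real schema.
    label: int
    centroid_z: float
    centroid_y: float
    centroid_x: float
    voxel_count: int


def largest_spot(records: list[SpotRecord]) -> SpotRecord:
    """Return the record with the most voxels (hypothetical helper)."""
    if not records:
        raise ValueError("no spot records given")
    return max(records, key=lambda rec: rec["voxel_count"])


rows: list[SpotRecord] = [
    {"label": 1, "centroid_z": 4.0, "centroid_y": 10.5, "centroid_x": 3.2, "voxel_count": 120},
    {"label": 2, "centroid_z": 9.1, "centroid_y": 2.0, "centroid_x": 7.7, "voxel_count": 340},
]
biggest = largest_spot(rows)
```

A `TypedDict` is still a plain `dict` at runtime, so existing CSV-writing code keeps working, while a type checker can flag a misspelled or missing key in any stage that consumes the table.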
Added an image to the README for visual enhancement.
4901dfc to e7d78d7
nbconvert appends .ipynb to --output, causing a PermissionError when writing to /dev/null.ipynb. Use --output-dir /tmp to discard output. https://claude.ai/code/session_015Y9zQk4A8uKJAorKuvBoCk
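The fix above can be sketched as a small Python helper that executes a notebook and throws the result away. This is an assumption-laden illustration rather than the project's CI code: the function names are hypothetical, and it swaps the commit's `/tmp` for a `tempfile.TemporaryDirectory` so the converted copy is deleted automatically.

```python
import subprocess
import tempfile


def nbconvert_smoke_command(notebook: str, output_dir: str) -> list[str]:
    """Build an nbconvert command that executes `notebook` into `output_dir`.

    Using --output-dir avoids --output, to which nbconvert appends ".ipynb"
    (the cause of the /dev/null.ipynb PermissionError described above).
    """
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--output-dir", output_dir,
        notebook,
    ]


def run_notebook_smoke_test(notebook: str) -> int:
    """Execute the notebook, discard the output, and return the exit code."""
    with tempfile.TemporaryDirectory() as scratch:
        cmd = nbconvert_smoke_command(notebook, scratch)
        return subprocess.run(cmd, check=False).returncode
```

Calling `run_notebook_smoke_test("notebooks/braggtrack_demo.ipynb")` would return 0 when every cell runs without error, assuming Jupyter is installed in the environment.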
Summary
- JOSS submission materials (`paper.md`, `paper.bib`, `CITATION.cff`) with full bibliography and machine-readable citation metadata
- Replace hand-rolled Otsu with `skimage.filters.threshold_otsu` (~100x faster)
- Fix `load_primary_volume` to return `np.ndarray` directly (eliminates ~200 MB transient garbage per volume load)
- Factor duplicated CLI helpers into `braggtrack/cli/_utils.py`
- `SpotRecord` TypedDict formalizing the feature-table contract across pipeline stages
- `py.typed` marker, dynamic `__version__`

Test plan
- `python -m unittest discover tests`: 147 tests pass
- `python scripts/ci_report.py`: all 6 acceptance gates pass (unit, week1–4, smoke)
- `jupyter nbconvert --execute notebooks/braggtrack_demo.ipynb`
- `pip install -e .` succeeds without torch/transformers
- `docker run --rm -v $PWD:/data openjournals/inara -o pdf paper.md` (optional)