ModelArrayIO

ModelArrayIO is a Python package that converts between neuroimaging formats (fixel .mif, voxel NIfTI, CIFTI-2 dscalar) and the HDF5 (.h5) layout used by the R package ModelArray. It can also write ModelArray statistical results back to imaging formats.

Relationship to ConFixel: The earlier project ConFixel is superseded by ModelArrayIO. The ConFixel repository is retained for history (including links from publications) and will be archived; new work should use this repository.

Documentation for installation and usage: ModelArrayIO on GitHub (this README). For conda, HDF5 libraries, and installing the ModelArray R package, see the ModelArray vignette Installation.

Overview

ModelArrayIO converts three kinds of neuroimaging data (CIFTI, NIfTI, and MRtrix .mif), with one import command and one export command. Once ModelArrayIO is installed, these commands are available in your terminal:

  • Neuroimaging data (CIFTI, NIfTI, or MRtrix .mif):
    • Neuroimaging → .h5: modelarrayio to-modelarray
    • .h5 → Neuroimaging: modelarrayio export-results

Storage backends: HDF5 and TileDB

ModelArrayIO supports two on-disk backends for the subject-by-element matrix:

  • HDF5 (default), implemented in modelarrayio/h5_storage.py
  • TileDB, implemented in modelarrayio/tiledb_storage.py

Both backends expose a similar API:

  • create a dense 2D array (subjects, items) and write all values at once
  • create an empty array with the same shape and write by column stripes
  • write/read column names alongside the data
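
The "write by column stripes" pattern can be sketched without either backend. The example below is a stdlib-only illustration, not the code in `h5_storage.py` or `tiledb_storage.py`: the helper names `column_stripes` and `write_by_stripes`, the `load_stripe` callback, and the list-of-lists matrix are all assumptions made for the example.

```python
from typing import Callable, Iterator, List, Tuple


def column_stripes(n_items: int, stripe_width: int) -> Iterator[Tuple[int, int]]:
    """Yield zero-based, half-open (start, stop) column ranges covering n_items."""
    if stripe_width <= 0:
        raise ValueError("stripe_width must be positive")
    for start in range(0, n_items, stripe_width):
        yield start, min(start + stripe_width, n_items)


def write_by_stripes(
    n_subjects: int,
    n_items: int,
    stripe_width: int,
    load_stripe: Callable[[int, int], List[List[float]]],
) -> List[List[float]]:
    """Allocate an empty (subjects, items) matrix, then fill it stripe by stripe."""
    # Stand-in for "create an empty array" in a real backend.
    matrix = [[0.0] * n_items for _ in range(n_subjects)]
    for start, stop in column_stripes(n_items, stripe_width):
        # load_stripe returns n_subjects rows, each with (stop - start) values.
        block = load_stripe(start, stop)
        for row, values in zip(matrix, block):
            row[start:stop] = values
    return matrix


def demo_loader(start: int, stop: int) -> List[List[float]]:
    """Hypothetical data source: value = 10 * subject_index + column_index."""
    return [[10 * subject + col for col in range(start, stop)] for subject in range(2)]


stripes = list(column_stripes(10, 4))
filled = write_by_stripes(n_subjects=2, n_items=5, stripe_width=2, load_stripe=demo_loader)
```

Here `stripes` is `[(0, 4), (4, 8), (8, 10)]`, so the last stripe is simply narrower than the others. Writing by stripes keeps only one block of columns in memory at a time, which matters when the item dimension is large.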

Notes and minor differences:

  • Chunking vs tiling: HDF5 uses chunks; TileDB uses tiles. We compute tile sizes analogous to chunk sizes to keep write/read patterns similar.
  • Compression: HDF5 uses gzip by default; TileDB defaults to zstd with shuffle, which trades off speed and compression ratio better. You can switch TileDB to gzip for parity with HDF5.
  • Metadata: HDF5 stores column_names as a dataset attribute; TileDB stores names as JSON metadata on the array/group.
  • Layout: Both backends keep dimensions in the same order and use zero-based indices.
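
As a rough illustration of the chunking and metadata notes above, the sketch below picks one block shape that could serve as either an HDF5 chunk or a TileDB tile, and round-trips column names through JSON. It is a minimal sketch under stated assumptions: the function names, the 1 MiB target, and the "span all subjects" rule are hypothetical and are not taken from ModelArrayIO's source.

```python
import json
from typing import List, Sequence, Tuple


def block_shape(
    n_subjects: int,
    n_items: int,
    itemsize: int = 4,            # bytes per value, e.g. float32 (assumption)
    target_bytes: int = 1 << 20,  # aim for roughly 1 MiB per block (assumption)
) -> Tuple[int, int]:
    """Return a (rows, cols) block that spans all subjects and fits target_bytes."""
    if n_subjects <= 0 or n_items <= 0:
        raise ValueError("array dimensions must be positive")
    # Number of full columns (one value per subject) that fit in the target size.
    cols = max(1, target_bytes // (n_subjects * itemsize))
    return n_subjects, min(n_items, cols)


def encode_column_names(names: Sequence[str]) -> str:
    """Serialize column names to a JSON string, as one might store in array metadata."""
    return json.dumps(list(names))


def decode_column_names(payload: str) -> List[str]:
    """Inverse of encode_column_names."""
    return json.loads(payload)


large = block_shape(n_subjects=100, n_items=1_000_000)
small = block_shape(n_subjects=100, n_items=500)
names = decode_column_names(encode_column_names(["sub-01", "sub-02"]))
```

Using the same shape for chunks and tiles means a column-stripe write touches the same number of blocks in either backend, which is the kind of parity the chunking note describes.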