saeyslab/harpy

Single-cell spatial omics analysis that makes you happy.

Documentation · Quick Start · Tutorials · Harpy Vitessce

💫 If you find Harpy useful, please give us a star! It helps others discover the project and supports continued development.

Why Harpy?

Harpy is a spatial omics analysis library for spatial transcriptomics and proteomics. Within the scverse stack, it bridges SpatialData and downstream analysis tools such as AnnData, Scanpy, and Squidpy. It provides scalable, image- and geometry-aware computation to transform raw spatial data into analysis-ready representations, with a strong emphasis on interoperability and large-scale workflows.

In practice, Harpy offers fast, out-of-core image preprocessing and tiled segmentation, along with efficient aggregation workflows that generate AnnData tables and compute per-cell features from images, segmentation masks, and transcript coordinates. It also supports deep feature extraction, pixel- and cell-level clustering, and the construction of single-cell representations from highly multiplexed images.
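
To make the aggregation idea concrete, here is a minimal sketch of computing per-cell features from an image and a segmentation mask. It is written in plain Python and does not use Harpy's API: the function name `per_cell_intensity_stats` and the toy data are hypothetical, and a real workflow would operate on chunked arrays rather than nested lists.

```python
from collections import defaultdict
from statistics import mean, median, pstdev


def per_cell_intensity_stats(image, mask):
    """Aggregate pixel intensities per cell label (label 0 is background)."""
    pixels = defaultdict(list)
    for image_row, mask_row in zip(image, mask):
        for intensity, label in zip(image_row, mask_row):
            if label != 0:
                pixels[label].append(intensity)
    # One row of summary statistics per cell, similar in spirit to a table
    # with cells as observations and features as columns.
    return {
        label: {"mean": mean(values), "median": median(values), "std": pstdev(values)}
        for label, values in sorted(pixels.items())
    }


# Toy single-channel image and a segmentation mask containing two cells.
image = [
    [1, 2, 9, 9],
    [3, 4, 9, 0],
]
mask = [
    [1, 1, 2, 2],
    [1, 1, 2, 0],
]

stats = per_cell_intensity_stats(image, mask)
for label, features in stats.items():
    print(label, features)
```

Each pixel is visited exactly once, which is the property that lets this kind of aggregation be applied chunk by chunk on images that do not fit in memory.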

  • Multi-platform support for spatial transcriptomics and proteomics data.
  • Interoperable outputs built on SpatialData.
  • Scales to (very) large images: tiled workflows with Dask; optional GPU acceleration with CuPy and PyTorch.
  • Scalable computational building blocks for segmentation, feature extraction, clustering, and spatial analysis.
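
The tiled workflows mentioned in the list above rely on splitting a large image into overlapping tiles that can be processed independently. The following is a rough sketch of that idea only, in plain Python; `tile_bounds` is a hypothetical helper, not a Harpy or Dask function. Dask provides this kind of chunked, overlapping computation through functions such as `dask.array.map_overlap`.

```python
def tile_bounds(height, width, tile_size, overlap=0):
    """Yield (row_start, row_stop, col_start, col_stop) for each tile.

    Each tile is grown by `overlap` pixels on every side (clipped to the
    image) so that objects near a tile border keep some context from the
    neighbouring tile.
    """
    for row in range(0, height, tile_size):
        for col in range(0, width, tile_size):
            yield (
                max(row - overlap, 0),
                min(row + tile_size + overlap, height),
                max(col - overlap, 0),
                min(col + tile_size + overlap, width),
            )


# A 5 x 7 image cut into tiles of 4 x 4 pixels with a 1-pixel overlap.
tiles = list(tile_bounds(height=5, width=7, tile_size=4, overlap=1))
for bounds in tiles:
    print(bounds)
```

Only one tile needs to be held in memory at a time, which is why the approach scales to very large images; the overlap is what allows results from neighbouring tiles to be reconciled afterwards.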

Installation

pip install harpy-analysis

With extras

pip install "harpy-analysis[extra]"

[extra] installs optional dependencies for:

  • Segmentation: cellpose
  • OpenCV support: opencv-python-headless
  • FlowSOM Clustering: flowsom
  • Notebook workflows: ipywidgets, tqdm, bokeh, textalloc, joypy, supervenn, nbconvert, ipython
  • CLI workflows: hydra-core

With extras and napari

pip install "harpy-analysis[extra,napari]"

[napari] adds:

  • napari[all]
  • napari-spatialdata
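
To see which of the optional dependencies listed above are available in the current environment, a small standard-library check along these lines may help. The helper `check_optional_dependencies` is hypothetical and not part of Harpy, and the import names in the mapping are assumptions (a package's import name can differ from its PyPI name).

```python
from importlib.util import find_spec

# PyPI name -> import name. The import names are assumptions; for example,
# opencv-python-headless is imported as cv2 and hydra-core as hydra.
OPTIONAL_MODULES = {
    "cellpose": "cellpose",
    "opencv-python-headless": "cv2",
    "flowsom": "flowsom",
    "hydra-core": "hydra",
    "napari": "napari",
    "napari-spatialdata": "napari_spatialdata",
}


def check_optional_dependencies(modules=OPTIONAL_MODULES):
    """Return {PyPI name: bool} telling whether each module can be found."""
    return {name: find_spec(module) is not None for name, module in modules.items()}


for name, available in check_optional_dependencies().items():
    print(f"{name:<24} {'installed' if available else 'missing'}")
```

`find_spec` only locates a top-level module without importing it, so the check stays fast even for heavy packages.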

For developers only: clone this repository locally, install the .[dev] dependencies instead of [extra], and read the contribution guide.

# Clone repository from GitHub
git clone https://github.com/saeyslab/harpy.git
cd harpy
uv venv --python=3.12  # create venv, set python version (>=3.11)
source .venv/bin/activate  # activate the virtual environment
uv pip install -e '.[dev]'  # editable install with dev tooling
python -c 'import harpy; print(harpy.__version__)'  # check if the package is installed
# make changes
python -m pytest  # run the tests

Harpy can also be installed with Anaconda, although we recommend uv. See the installation guide.

Quickstart

See the short, runnable guide.

🧭 Tutorials and Guides

Explore how to use Harpy for segmentation, shallow and deep feature extraction, clustering, and spatial analysis of gigapixel-scale multiplexed data with these step-by-step notebooks:

  • 🚀 Basic Usage of Harpy

    Learn how to read in data, perform tiled segmentation using Cellpose and Dask-CUDA, extract features, perform QC and analyze results downstream with Scanpy and Squidpy.

    👉 Tutorial image-based transcriptomics, Human Ovarian Cancer, Xenium 10x Genomics

    👉 Tutorial proteomics, MACSima

  • 🔧 Technology-specific advice

    Learn which technologies Harpy supports. 👉 Tutorial

  • 🧩 Pixel and Cell Clustering

    Learn how to perform unsupervised pixel- and cell-level clustering using Harpy together with FlowSOM. 👉 Tutorial

  • ✂️ Cell Segmentation

    Explore segmentation workflows in Harpy using different tools:

    💡 Want us to add support for another segmentation method? 👉 Open an issue and let us know!

  • 🧪 Single-cell representations from highly multiplexed images and downstream use with PyTorch

    Learn how single-cell representations can be generated from highly multiplexed images. These representations can then be used downstream to train classifiers in PyTorch. 👉 Tutorial

  • 🧠 Deep Feature Extraction

    Discover how Harpy enables fast, scalable extraction of deep, cell-level features from multiplex imaging data with the KRONOS foundation model for proteomics. 👉 Tutorial

    💡 Want us to add support for another deep feature extraction method? 👉 Open an issue and let us know!

  • 🔬 Shallow Feature Extraction

    Learn to extract shallow features—such as mean, median, and standard deviation of intensities—from multiplex imaging data with Harpy. 👉 Tutorial

  • 🧬 Spatial Transcriptomics

    Learn how to analyze spatial transcriptomics data with Harpy.

    👉 Tutorial (Mouse Liver, Resolve Molecular Cartography)

    👉 Tutorial (Human Ovarian Cancer, Xenium 10x Genomics)


  • 🌐 Multiple samples and coordinate systems

    Learn how to work with multiple samples, intrinsic and micron coordinates. 👉 Tutorial


  • 📐 Rasterize and vectorize labels and shapes

    Learn how to convert a segmentation mask (array) into its vectorized form, and segmentation boundaries (polygons) into their rasterized equivalents. This conversion is useful, for example, when integrating annotations (e.g., from QuPath) into downstream spatial omics analysis. 👉 Tutorial
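
As an illustration of the rasterization direction described in the last item, the sketch below burns a polygon into a label mask by testing each pixel centre with an even-odd (ray casting) rule. It is plain Python for clarity and is not Harpy's implementation; `point_in_polygon`, `rasterize`, and the example square are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Even-odd (ray casting) test of a point against a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray starting at (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(polygon, height, width, label=1):
    """Return a label mask where pixels whose centre lies inside the polygon get `label`."""
    return [
        [label if point_in_polygon(col + 0.5, row + 0.5, polygon) else 0 for col in range(width)]
        for row in range(height)
    ]


# Hypothetical annotation: a square given in pixel coordinates.
square = [(1, 1), (3, 1), (3, 3), (1, 3)]
mask = rasterize(square, height=4, width=4)
for row in mask:
    print(row)
```

Testing pixel centres rather than corners is one common convention; other conventions (for example, marking every pixel the polygon touches) give slightly different masks along the boundary.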


📚 For a complete list of tutorials, visit the Harpy documentation.

Computational benchmark

Explore the benchmark performance of Harpy on a large MACSima tonsil proteomics dataset. 👉 Results

Contributing

See the contribution guide for info on how to contribute to Harpy.

Citation

If you use Harpy in your work, please cite:

Benjamin Rombaut, Arne Defauw, Frank Vernaillen, Julien Mortier, Evelien Van Hamme, Sofie Van Gassen, Ruth Seurinck, Yvan Saeys. Scalable analysis of whole slide spatial proteomics with Harpy. Bioinformatics (2026), btag122. https://doi.org/10.1093/bioinformatics/btag122

If you use Harpy for spatial transcriptomics analysis, please cite:

Lotte Pollaris, Bavo Vanneste, Benjamin Rombaut, Arne Defauw, Frank Vernaillen, Julien Mortier, Wout Vanhenden, Liesbet Martens, Tinne Thone, Jean-Francois Hastir, Anna Bujko, Wouter Saelens, Jean-Christophe Marine, Hilde Nelissen, Evelien Van Hamme, Ruth Seurinck, Charlotte L. Scott, Martin Guilliams, Yvan Saeys. SPArrOW: a flexible, interactive and scalable pipeline for spatial transcriptomics analysis. https://doi.org/10.1101/2024.07.04.601829

License

Check the license. Harpy is free for academic usage. For commercial usage, please contact Saeyslab.

Issues

If you encounter any problems, please file an issue along with a detailed description.
