|
| 1 | +# CLAUDE.md |
| 2 | + |
| 3 | +This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. |
| 4 | + |
| 5 | +## Project Overview |
| 6 | + |
| 7 | +**cell-eval** is a Python package and CLI tool for evaluating models that predict cellular responses to perturbations at the single-cell level. It is developed by the Arc Research Institute. |
| 8 | + |
| 9 | +The package compares a *real* AnnData object with a *predicted* AnnData object and quantifies the differences between them across a range of metrics. |
| 10 | + |
| 11 | +- Python 3.11–3.12, managed with **uv** and built with **hatchling** |
| 12 | +- CLI entry point: `cell-eval` (defined in `src/cell_eval/__main__.py`) |
| 13 | + |
| 14 | +## Common Commands |
| 15 | + |
| 16 | +```bash |
| 17 | +# Install dependencies |
| 18 | +uv sync --all-extras --dev |
| 19 | + |
| 20 | +# Run all tests |
| 21 | +uv run pytest -v |
| 22 | + |
| 23 | +# Run a single test |
| 24 | +uv run pytest tests/test_eval.py::test_broken_adata_not_normlog -v |
| 25 | + |
| 26 | +# Formatting (check / fix) |
| 27 | +uv run ruff format --check |
| 28 | +uv run ruff format |
| 29 | + |
| 30 | +# Type checking |
| 31 | +uv run ty check |
| 32 | + |
| 33 | +# Verify CLI works |
| 34 | +uv run cell-eval --help |
| 35 | +``` |
| 36 | + |
| 37 | +CI runs: formatting, typing, pytest, and cli-test (see `.github/workflows/CI.yml`). |
| 38 | + |
| 39 | +## Architecture |
| 40 | + |
| 41 | +### Core Data Flow |
| 42 | + |
| 43 | +``` |
| 44 | +AnnData inputs (predicted + real) |
| 45 | + → MetricsEvaluator (validation, normalization, DE computation) |
| 46 | + → MetricPipeline (profile-based metric selection + execution) |
| 47 | + → metrics_registry (global MetricRegistry instance) |
| 48 | + → individual metric functions |
| 49 | + → polars DataFrames (per-perturbation + aggregated results) |
| 50 | +``` |
| 51 | + |
| 52 | +### Key Abstractions |
| 53 | + |
| 54 | +- **`MetricsEvaluator`** (`src/cell_eval/_evaluator.py`) — Main programmatic entry point. Validates input AnnData objects, computes differential expression via `pdex`, and orchestrates the metric pipeline. |
| 55 | + |
| 56 | +- **`MetricRegistry`** (`src/cell_eval/metrics/_registry.py`) — Global singleton `metrics_registry`. Metrics are registered with a name, type (`DE` or `ANNDATA_PAIR`), compute function, and best-value indicator. Supports both plain functions and class-based metrics requiring instantiation. |
| 57 | + |
| 58 | +- **`MetricPipeline`** (`src/cell_eval/_pipeline/_runner.py`) — Selects and runs metrics based on a profile (`full`, `minimal`, `vcc`, `de`, `anndata`, `pds`). Collects per-perturbation results and aggregates them. |
| 59 | + |
| 60 | +- **`Metric` protocol** (`src/cell_eval/metrics/base.py`) — All metric functions take either a `PerturbationAnndataPair` or `DEComparison` and return `float | dict[str, float]`. |
| 61 | + |
| 62 | +- **Type system** (`src/cell_eval/_types/`) — Immutable dataclasses: `PerturbationAnndataPair`, `DEComparison`, plus enums `MetricType`, `MetricBestValue`, `DESortBy`. |
| 63 | + |
| 64 | +### Metrics |
| 65 | + |
| 66 | +Metrics are split into two categories registered in `src/cell_eval/metrics/_impl.py`: |
| 67 | + |
| 68 | +- **AnnData metrics** (`_anndata.py`): `pearson_delta`, `mse`, `mae`, `mse_delta`, `mae_delta`, `discrimination_score`, `clustering_agreement`, `edistance` |
| 69 | +- **DE metrics** (`_de.py`): overlap/precision at N, spearman correlations, direction match, significant gene recall, ROC/PR AUC |
| 70 | + |
| 71 | +### CLI |
| 72 | + |
| 73 | +Subcommands in `src/cell_eval/_cli/`: `prep` (data preparation for VCC), `run` (evaluation), `baseline` (create baseline), `score` (normalize against baseline). CLI defaults are in `_cli/_const.py`. |
| 74 | + |
| 75 | +### Test Data Utilities |
| 76 | + |
| 77 | +`cell_eval.data` provides `build_random_anndata()` and `downsample_cells()` for generating synthetic AnnData objects in tests. |
| 78 | + |
| 79 | +## Conventions |
| 80 | + |
| 81 | +- Uses `polars` (not pandas) for DataFrames |
| 82 | +- Uses `match`/`case` statements (Python 3.10+ syntax) |
| 83 | +- Type hints throughout; PEP 561 `py.typed` marker present |
| 84 | +- Private modules prefixed with `_` (public API is re-exported from `__init__.py`) |