From 4115023926d62226da121cc80a24e20da14bfa89 Mon Sep 17 00:00:00 2001
From: Lukas Heumos <lukas.heumos@posteo.net>
Date: Mon, 25 May 2026 11:51:43 +0200
Subject: [PATCH 1/4] de: clarify model choice, design matrices, and result
 interpretation

Addresses several of the open narrative TODOs in scverse/pertpy#615
without restructuring the notebook:

- Expand the intro to spell out the three families of DE models in
  pertpy (simple tests, pseudobulk + GLM, generic linear models) and
  when to pick which.
- Replace the partial "<!-- ... -->" lead-in to the edgeR section with
  a short explainer of design matrices and Wilkinson-formula syntax,
  using the model fitted right below as the running example.
- Replace the one-liner before `test_contrasts` with a brief contrast
  primer so readers understand what `contrast()` actually returns.
- Add a result-table interpretation cell that names every column the
  reader will see (`log_fc`, `adj_p_value`, ...) and the standard
  significance rule of thumb.
- Point readers to decoupler's pathway-enrichment workflow as the
  natural next step after producing a ranked gene table.

No code cells were modified; existing outputs are untouched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 differential_gene_expression.ipynb | 53 +++++++++++++++++++++++++-----
 1 file changed, 44 insertions(+), 9 deletions(-)
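
Illustration for reviewers (sits below the `---`, so it is not part of the commit): the design-matrix and contrast wording added in this patch can be sanity-checked with a small pure-Python sketch. It assumes treatment coding with the alphabetically first level as baseline. The level names, the coefficient values, and the helpers `additive_design` and `simple_contrast` are hypothetical and are not pertpy API. The `Factor[T.level]` column naming only mimics what formula libraries commonly print.

```python
# Hypothetical sample annotations; the level names are invented for illustration.
samples = [
    {"Treatment": "drugA", "Efficacy": "responder"},
    {"Treatment": "drugA", "Efficacy": "non_responder"},
    {"Treatment": "drugB", "Efficacy": "responder"},
    {"Treatment": "drugB", "Efficacy": "non_responder"},
]


def additive_design(samples, factors):
    """Treatment-coded design matrix for `~ f1 + f2 + ...` (no interaction terms)."""
    columns = ["Intercept"]
    levels = {}
    for factor in factors:
        levels[factor] = sorted({s[factor] for s in samples})
        # The first (sorted) level is the baseline and is absorbed into the intercept.
        columns += [f"{factor}[T.{lvl}]" for lvl in levels[factor][1:]]
    rows = []
    for s in samples:
        row = [1.0]
        for factor in factors:
            row += [1.0 if s[factor] == lvl else 0.0 for lvl in levels[factor][1:]]
        rows.append(row)
    return columns, rows


def simple_contrast(columns, factor, baseline, group_to_compare):
    """Contrast vector encoding `group_to_compare - baseline` along one factor."""
    vec = [0.0] * len(columns)
    for level, sign in ((group_to_compare, 1.0), (baseline, -1.0)):
        name = f"{factor}[T.{level}]"
        if name in columns:  # the baseline level has no column of its own
            vec[columns.index(name)] += sign
    return vec


columns, rows = additive_design(samples, ["Treatment", "Efficacy"])
contrast = simple_contrast(columns, "Treatment", baseline="drugA", group_to_compare="drugB")

# Made-up fitted coefficients for one gene, in the same order as `columns`.
coefficients = [5.0, 1.5, -0.4]
log_fc = sum(c * b for c, b in zip(contrast, coefficients))
```

The contrast picks out a single coefficient here because the baseline is folded into the intercept; a comparison between two non-baseline levels would carry both a `+1` and a `-1` entry.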

diff --git a/differential_gene_expression.ipynb b/differential_gene_expression.ipynb
index 5874f3d..4550632 100644
--- a/differential_gene_expression.ipynb
+++ b/differential_gene_expression.ipynb
@@ -12,13 +12,15 @@
    "metadata": {},
    "source": [
     "Differential gene expression (DGE) analysis identifies genes that show statistically significant differences in expression levels across distinct cell populations or conditions.\n",
-    "This analysis helps in identifying which cell types are most affected by a condition of interest such as a disease, and characterizing their functional signatures. \n",
+    "This analysis helps in identifying which cell types are most affected by a condition of interest such as a disease, and characterizing their functional signatures.\n",
     "\n",
-    "Pertpy provides an API to access several types of models for differential expression analysis.\n",
-    "The first group of models comprises the T-test and Wilcoxon test as simple statistical tests for comparing expression values between two groups without covariates.\n",
-    "The second group includes models of the linear model family that allow modeling complex designs and contrasts. Currently included are [PyDESeq2](https://academic.oup.com/bioinformatics/article/39/9/btad547/7260507), [edgeR](https://academic.oup.com/bioinformatics/article/26/1/139/182458) as well as a wrapper around statsmodels [Statsmodels](https://www.statsmodels.org). which provides access to a wide range of regression models, including ordinary least squares regression, robust linear models and generalized linear models.\n",
+    "Pertpy provides a unified API to several families of DGE models so you can pick the one that fits your design:\n",
     "\n",
-    "In the following tutorial we will demonstrate how the edgeR interface can be used to model complex interactions using the triple-negative breast cancer (TNBC) [Zhang dataset](https://www.sciencedirect.com/science/article/pii/S1535610821004992)."
+    "- **Simple statistical tests** ([`TTest`](https://pertpy.readthedocs.io/en/stable/api/tools/pertpy.tools.TTest.html), [`WilcoxonTest`](https://pertpy.readthedocs.io/en/stable/api/tools/pertpy.tools.WilcoxonTest.html)) compare two groups directly on the expression matrix. They are fast, assumption-light, and a reasonable choice when you only have a single binary condition and no covariates to adjust for.\n",
+    "- **Pseudobulk + count-based GLMs** ([`EdgeR`](https://pertpy.readthedocs.io/en/stable/api/tools/pertpy.tools.EdgeR.html), [`PyDESeq2`](https://pertpy.readthedocs.io/en/stable/api/tools/pertpy.tools.PyDESeq2.html)) aggregate cells into per-sample pseudobulks and fit a negative-binomial GLM. This is the recommended approach for multi-sample studies with covariates and the [current best practice](https://www.sc-best-practices.org/conditions/differential_gene_expression.html) for scRNA-seq DE because it controls false positives caused by treating individual cells as independent replicates. `EdgeR` calls into the R package via `rpy2`; `PyDESeq2` is a pure-Python reimplementation of DESeq2.\n",
+    "- **Generic linear models** ([`Statsmodels`](https://pertpy.readthedocs.io/en/stable/api/tools/pertpy.tools.Statsmodels.html)) wrap [statsmodels](https://www.statsmodels.org) and expose OLS, robust linear models, and GLMs. Use this when your response variable does not look like a count (for example, log-normalised expression or a continuous score).\n",
+    "\n",
+    "In the following tutorial we will demonstrate how the edgeR and PyDESeq2 interfaces can be used to model complex interactions using the triple-negative breast cancer (TNBC) [Zhang dataset](https://www.sciencedirect.com/science/article/pii/S1535610821004992).\n"
    ]
   },
   {
@@ -948,14 +950,21 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "<!-- The interface for edgeR supports complex designs. -->\n",
-    "Here, we are fit a model to capture that the type of treatment (`Treatment`) and response to the treatment (`Efficacy`) contribute independently to the gene expression levels.\n",
+    "A linear model needs a **design matrix** that encodes, for each sample, the values of the covariates we want to control for or test against.\n",
+    "Pertpy lets you describe the design with a [Wilkinson formula](https://matthewwardrop.github.io/formulaic/latest/formulas/) (the same syntax used by R and patsy).\n",
+    "A few examples:\n",
+    "\n",
+    "- `~ Treatment` &mdash; intercept plus a coefficient per treatment level (the baseline level is absorbed into the intercept).\n",
+    "- `~ Treatment + Efficacy` &mdash; both covariates as **additive**, independent effects.\n",
+    "- `~ Treatment + Efficacy + Treatment:Efficacy` (equivalently `~ Treatment * Efficacy`) &mdash; additive effects plus an **interaction** term that asks whether the effect of one covariate depends on the level of the other.\n",
+    "\n",
+    "Here we start with the additive model: the type of treatment (`Treatment`) and response to the treatment (`Efficacy`) contribute independently to gene expression.\n",
     "By doing so, we can evaluate:\n",
     "\n",
     "1. How much of the variation in gene expression can be attributed to differences in drug efficacy, independent of the type of treatment.\n",
     "2. How much of the variation is due to the type of treatment, independent of the efficacy of the drug.\n",
     "\n",
-    "This setup helps in understanding not just whether a treatment works, but how its effectiveness might vary or be influenced by the inherent efficacy of the drug."
+    "This setup helps in understanding not just whether a treatment works, but how its effectiveness might vary or be influenced by the inherent efficacy of the drug.\n"
    ]
   },
   {
@@ -1012,7 +1021,9 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "To now determine the differentially expressed genes between the treatments, we can specify a contrast as follows:"
+    "Fitting the model gave us one coefficient per term in the design.\n",
+    "To turn those coefficients into a biological comparison we evaluate a **contrast**: a vector with one entry per coefficient that says \"take this combination of fitted effects and test whether it differs from zero\".\n",
+    "For a simple two-group comparison along a single column, `contrast(column=..., baseline=..., group_to_compare=...)` builds the right vector automatically (we will assemble more complex contrasts by hand further down).\n"
    ]
   },
   {
@@ -1209,6 +1220,21 @@
     "res_df.head(10)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The result table has one row per gene. The key columns to look at:\n",
+    "\n",
+    "- `variable` &mdash; gene symbol.\n",
+    "- `log_fc` &mdash; log2 fold change between the two contrasted groups. Positive values mean the gene is up in `group_to_compare` relative to `baseline`; negative values mean it is down.\n",
+    "- `p_value` &mdash; raw p-value from the test that `log_fc` differs from zero. Useful for diagnostics but **do not** threshold on this directly.\n",
+    "- `adj_p_value` &mdash; p-value after Benjamini–Hochberg adjustment for multiple testing. This is the value to threshold on (commonly `< 0.05`).\n",
+    "- `contrast` &mdash; identifier of the contrast when multiple are tested at once; `None` here because we only ran one contrast.\n",
+    "\n",
+    "A standard summary is \"significantly differentially expressed\" = `adj_p_value < 0.05` **and** `|log_fc|` above some effect-size threshold (the volcano plot below uses `log2fc_thresh` for the latter).\n"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -1241,6 +1267,15 @@
     "edgr.plot_volcano(res_df, log2fc_thresh=0)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Once you have a ranked gene table, a typical next step is **gene-set / pathway enrichment** to turn the per-gene statistics into per-pathway statistics.\n",
+    "[decoupler](https://decoupler-py.readthedocs.io/) integrates well with the result tables produced here &mdash; for example, you can pass the `log_fc` column from `res_df` (indexed by `variable`) into `dc.mt.ulm` together with a prior knowledge network from [OmniPath](https://omnipathdb.org/) (`dc.op.collectri`, `dc.op.progeny`, MSigDB hallmark gene sets, &hellip;) to score transcription-factor or pathway activities.\n",
+    "We do not run an enrichment analysis here to keep this tutorial focused on the DE step, but `decoupler`'s [single-sample functional analysis](https://decoupler-py.readthedocs.io/en/latest/notebooks/bulk.html) tutorial is a good starting point.\n"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {

From 60b0d160b4500c8576ea2e7f5d4b8c383704007c Mon Sep 17 00:00:00 2001
From: Lukas Heumos <lukas.heumos@posteo.net>
Date: Mon, 25 May 2026 12:00:34 +0200
Subject: [PATCH 2/4] de: split sentences per line, switch pertpy refs to
 intersphinx
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Cleanup pass on the previous narrative additions:
- Every sentence in every new markdown cell now lives on its own line,
  matching the source-prose convention used elsewhere in this repo.
- pertpy class references switch from full readthedocs URLs to MyST
  cross-references ({class}`~pertpy.tools.EdgeR`, etc.) so the links
  stay valid as the API moves and render with proper styling.
- {meth}`~pertpy.tools.EdgeR.contrast` is used in place of the inline
  signature.
- decoupler references stay as plain URLs / `code` spans for now —
  decoupler's published objects.inv currently contains no py: entries,
  so intersphinx targets there would fail at build time.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 differential_gene_expression.ipynb | 44 ++++++++++++++++++------------
 1 file changed, 27 insertions(+), 17 deletions(-)

diff --git a/differential_gene_expression.ipynb b/differential_gene_expression.ipynb
index 4550632..8c2c87d 100644
--- a/differential_gene_expression.ipynb
+++ b/differential_gene_expression.ipynb
@@ -16,9 +16,13 @@
     "\n",
     "Pertpy provides a unified API to several families of DGE models so you can pick the one that fits your design:\n",
     "\n",
-    "- **Simple statistical tests** ([`TTest`](https://pertpy.readthedocs.io/en/stable/api/tools/pertpy.tools.TTest.html), [`WilcoxonTest`](https://pertpy.readthedocs.io/en/stable/api/tools/pertpy.tools.WilcoxonTest.html)) compare two groups directly on the expression matrix. They are fast, assumption-light, and a reasonable choice when you only have a single binary condition and no covariates to adjust for.\n",
-    "- **Pseudobulk + count-based GLMs** ([`EdgeR`](https://pertpy.readthedocs.io/en/stable/api/tools/pertpy.tools.EdgeR.html), [`PyDESeq2`](https://pertpy.readthedocs.io/en/stable/api/tools/pertpy.tools.PyDESeq2.html)) aggregate cells into per-sample pseudobulks and fit a negative-binomial GLM. This is the recommended approach for multi-sample studies with covariates and the [current best practice](https://www.sc-best-practices.org/conditions/differential_gene_expression.html) for scRNA-seq DE because it controls false positives caused by treating individual cells as independent replicates. `EdgeR` calls into the R package via `rpy2`; `PyDESeq2` is a pure-Python reimplementation of DESeq2.\n",
-    "- **Generic linear models** ([`Statsmodels`](https://pertpy.readthedocs.io/en/stable/api/tools/pertpy.tools.Statsmodels.html)) wrap [statsmodels](https://www.statsmodels.org) and expose OLS, robust linear models, and GLMs. Use this when your response variable does not look like a count (for example, log-normalised expression or a continuous score).\n",
+    "- **Simple statistical tests** ({class}`~pertpy.tools.TTest`, {class}`~pertpy.tools.WilcoxonTest`) compare two groups directly on the expression matrix.\n",
+    "  They are fast, assumption-light, and a reasonable choice when you only have a single binary condition and no covariates to adjust for.\n",
+    "- **Pseudobulk + count-based GLMs** ({class}`~pertpy.tools.EdgeR`, {class}`~pertpy.tools.PyDESeq2`) aggregate cells into per-sample pseudobulks and fit a negative-binomial GLM.\n",
+    "  This is the recommended approach for multi-sample studies with covariates and the [current best practice](https://www.sc-best-practices.org/conditions/differential_gene_expression.html) for scRNA-seq DE because it controls false positives caused by treating individual cells as independent replicates.\n",
+    "  {class}`~pertpy.tools.EdgeR` calls into the R package via `rpy2`; {class}`~pertpy.tools.PyDESeq2` is a pure-Python reimplementation of DESeq2.\n",
+    "- **Generic linear models** ({class}`~pertpy.tools.Statsmodels`) wrap [statsmodels](https://www.statsmodels.org) and expose OLS, robust linear models, and GLMs.\n",
+    "  Use this when your response variable does not look like a count (for example, log-normalised expression or a continuous score).\n",
     "\n",
     "In the following tutorial we will demonstrate how the edgeR and PyDESeq2 interfaces can be used to model complex interactions using the triple-negative breast cancer (TNBC) [Zhang dataset](https://www.sciencedirect.com/science/article/pii/S1535610821004992).\n"
    ]
@@ -951,12 +955,12 @@
    "metadata": {},
    "source": [
     "A linear model needs a **design matrix** that encodes, for each sample, the values of the covariates we want to control for or test against.\n",
-    "Pertpy lets you describe the design with a [Wilkinson formula](https://matthewwardrop.github.io/formulaic/latest/formulas/) (the same syntax used by R and patsy).\n",
+    "Pertpy lets you describe the design with a [Wilkinson formula](https://matthewwardrop.github.io/formulaic/latest/formulas/) — the same syntax used by R and patsy.\n",
     "A few examples:\n",
     "\n",
-    "- `~ Treatment` &mdash; intercept plus a coefficient per treatment level (the baseline level is absorbed into the intercept).\n",
-    "- `~ Treatment + Efficacy` &mdash; both covariates as **additive**, independent effects.\n",
-    "- `~ Treatment + Efficacy + Treatment:Efficacy` (equivalently `~ Treatment * Efficacy`) &mdash; additive effects plus an **interaction** term that asks whether the effect of one covariate depends on the level of the other.\n",
+    "- `~ Treatment` — intercept plus a coefficient per treatment level (the baseline level is absorbed into the intercept).\n",
+    "- `~ Treatment + Efficacy` — both covariates as **additive**, independent effects.\n",
+    "- `~ Treatment + Efficacy + Treatment:Efficacy` (equivalently `~ Treatment * Efficacy`) — additive effects plus an **interaction** term that asks whether the effect of one covariate depends on the level of the other.\n",
     "\n",
     "Here we start with the additive model: the type of treatment (`Treatment`) and response to the treatment (`Efficacy`) contribute independently to gene expression.\n",
     "By doing so, we can evaluate:\n",
@@ -1023,7 +1027,8 @@
    "source": [
     "Fitting the model gave us one coefficient per term in the design.\n",
     "To turn those coefficients into a biological comparison we evaluate a **contrast**: a vector with one entry per coefficient that says \"take this combination of fitted effects and test whether it differs from zero\".\n",
-    "For a simple two-group comparison along a single column, `contrast(column=..., baseline=..., group_to_compare=...)` builds the right vector automatically (we will assemble more complex contrasts by hand further down).\n"
+    "For a simple two-group comparison along a single column, {meth}`~pertpy.tools.EdgeR.contrast` builds the right vector automatically.\n",
+    "We will assemble more complex contrasts by hand further down.\n"
    ]
   },
   {
@@ -1224,13 +1229,17 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "The result table has one row per gene. The key columns to look at:\n",
+    "The result table has one row per gene.\n",
+    "The key columns to look at:\n",
     "\n",
-    "- `variable` &mdash; gene symbol.\n",
-    "- `log_fc` &mdash; log2 fold change between the two contrasted groups. Positive values mean the gene is up in `group_to_compare` relative to `baseline`; negative values mean it is down.\n",
-    "- `p_value` &mdash; raw p-value from the test that `log_fc` differs from zero. Useful for diagnostics but **do not** threshold on this directly.\n",
-    "- `adj_p_value` &mdash; p-value after Benjamini–Hochberg adjustment for multiple testing. This is the value to threshold on (commonly `< 0.05`).\n",
-    "- `contrast` &mdash; identifier of the contrast when multiple are tested at once; `None` here because we only ran one contrast.\n",
+    "- `variable` — gene symbol.\n",
+    "- `log_fc` — log2 fold change between the two contrasted groups.\n",
+    "  Positive values mean the gene is up in `group_to_compare` relative to `baseline`; negative values mean it is down.\n",
+    "- `p_value` — raw p-value from the test that `log_fc` differs from zero.\n",
+    "  Useful for diagnostics, but do not threshold on this directly.\n",
+    "- `adj_p_value` — p-value after Benjamini–Hochberg adjustment for multiple testing.\n",
+    "  This is the value to threshold on (commonly `< 0.05`).\n",
+    "- `contrast` — identifier of the contrast when multiple are tested at once; `None` here because we only ran one contrast.\n",
     "\n",
     "A standard summary is \"significantly differentially expressed\" = `adj_p_value < 0.05` **and** `|log_fc|` above some effect-size threshold (the volcano plot below uses `log2fc_thresh` for the latter).\n"
    ]
@@ -1271,9 +1280,10 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Once you have a ranked gene table, a typical next step is **gene-set / pathway enrichment** to turn the per-gene statistics into per-pathway statistics.\n",
-    "[decoupler](https://decoupler-py.readthedocs.io/) integrates well with the result tables produced here &mdash; for example, you can pass the `log_fc` column from `res_df` (indexed by `variable`) into `dc.mt.ulm` together with a prior knowledge network from [OmniPath](https://omnipathdb.org/) (`dc.op.collectri`, `dc.op.progeny`, MSigDB hallmark gene sets, &hellip;) to score transcription-factor or pathway activities.\n",
-    "We do not run an enrichment analysis here to keep this tutorial focused on the DE step, but `decoupler`'s [single-sample functional analysis](https://decoupler-py.readthedocs.io/en/latest/notebooks/bulk.html) tutorial is a good starting point.\n"
+    "Once you have a ranked gene table, a typical next step is **gene-set / pathway enrichment** to turn per-gene statistics into per-pathway statistics.\n",
+    "[decoupler](https://decoupler-py.readthedocs.io/) integrates well with the result tables produced here.\n",
+    "For example, you can pass the `log_fc` column from `res_df` (indexed by `variable`) into `decoupler.mt.ulm` together with a prior-knowledge network from [OmniPath](https://omnipathdb.org/) (`decoupler.op.collectri`, `decoupler.op.progeny`, `decoupler.op.hallmark`, …) to score transcription-factor or pathway activities.\n",
+    "We do not run an enrichment analysis here to keep this tutorial focused on the DE step; decoupler's [bulk functional analysis tutorial](https://decoupler-py.readthedocs.io/en/latest/notebooks/bulk.html) is a good starting point.\n"
    ]
   },
   {

From 5738290ae36c40a052c6a56b4762ec6577da86f6 Mon Sep 17 00:00:00 2001
From: Lukas Heumos <lukas.heumos@posteo.net>
Date: Mon, 25 May 2026 12:07:58 +0200
Subject: [PATCH 3/4] tutorials: swap pertpy/decoupler/mudata URLs for
 intersphinx refs

Sweep across the tutorials replacing readthedocs URLs with MyST
cross-references wherever a populated objects.inv exists for the target.

- differential_gene_expression.ipynb: switch decoupler enrichment
  pointers to {mod}/{func}/{any} refs (decoupler exposes its methods
  as py:data, so `{any}` is used for `decoupler.mt.ulm`).
- distance_tests.ipynb, distances.ipynb,
  expression_prediction_evaluation.ipynb: replace pertpy.tools.Distance
  URLs with {class}`~pertpy.tools.Distance`.
- distances.ipynb, expression_prediction_evaluation.ipynb: replace
  cross-tutorial readthedocs URLs (distance_tests, distances,
  scgen_perturbation_prediction) with relative {doc} refs so they no
  longer go stale when paths/versions move.
- milo.ipynb: MuData URL -> {class}`~mudata.MuData`.

Pages without a clean intersphinx target (perturbation_space anchor,
statsmodels example_formulas page, ete3 project homepage) are left as
plain URL links.

Requires the matching decoupler entry in pertpy's docs/conf.py
intersphinx_mapping (sent in companion pertpy PR).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 differential_gene_expression.ipynb     | 8 ++++----
 distance_tests.ipynb                   | 2 +-
 distances.ipynb                        | 4 ++--
 expression_prediction_evaluation.ipynb | 6 +++---
 milo.ipynb                             | 2 +-
 5 files changed, 11 insertions(+), 11 deletions(-)
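
Illustration for reviewers (sits below the `---`, so it is not part of the commit): the companion change this patch depends on would look roughly like the fragment below. This is a sketch that assumes pertpy's `docs/conf.py` already enables `sphinx.ext.intersphinx` and defines `intersphinx_mapping`; the existing entries are not reproduced, and the base URL is taken from the links used in this series.

```python
# docs/conf.py (fragment); existing entries are elided and assumed unchanged.
intersphinx_mapping = {
    # ... existing entries ...
    # (base URL, None) makes Sphinx fetch the inventory from <base URL>/objects.inv.
    "decoupler": ("https://decoupler.readthedocs.io/en/latest/", None),
}
```

Without this entry the decoupler cross-references in `differential_gene_expression.ipynb` have no inventory to resolve against.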

diff --git a/differential_gene_expression.ipynb b/differential_gene_expression.ipynb
index 8c2c87d..5858d8e 100644
--- a/differential_gene_expression.ipynb
+++ b/differential_gene_expression.ipynb
@@ -21,7 +21,7 @@
     "- **Pseudobulk + count-based GLMs** ({class}`~pertpy.tools.EdgeR`, {class}`~pertpy.tools.PyDESeq2`) aggregate cells into per-sample pseudobulks and fit a negative-binomial GLM.\n",
     "  This is the recommended approach for multi-sample studies with covariates and the [current best practice](https://www.sc-best-practices.org/conditions/differential_gene_expression.html) for scRNA-seq DE because it controls false positives caused by treating individual cells as independent replicates.\n",
     "  {class}`~pertpy.tools.EdgeR` calls into the R package via `rpy2`; {class}`~pertpy.tools.PyDESeq2` is a pure-Python reimplementation of DESeq2.\n",
-    "- **Generic linear models** ({class}`~pertpy.tools.Statsmodels`) wrap [statsmodels](https://www.statsmodels.org) and expose OLS, robust linear models, and GLMs.\n",
+    "- **Generic linear models** ({class}`~pertpy.tools.Statsmodels`) wrap {mod}`statsmodels` and expose OLS, robust linear models, and GLMs.\n",
     "  Use this when your response variable does not look like a count (for example, log-normalised expression or a continuous score).\n",
     "\n",
     "In the following tutorial we will demonstrate how the edgeR and PyDESeq2 interfaces can be used to model complex interactions using the triple-negative breast cancer (TNBC) [Zhang dataset](https://www.sciencedirect.com/science/article/pii/S1535610821004992).\n"
@@ -1281,9 +1281,9 @@
    "metadata": {},
    "source": [
     "Once you have a ranked gene table, a typical next step is **gene-set / pathway enrichment** to turn per-gene statistics into per-pathway statistics.\n",
-    "[decoupler](https://decoupler-py.readthedocs.io/) integrates well with the result tables produced here.\n",
-    "For example, you can pass the `log_fc` column from `res_df` (indexed by `variable`) into `decoupler.mt.ulm` together with a prior-knowledge network from [OmniPath](https://omnipathdb.org/) (`decoupler.op.collectri`, `decoupler.op.progeny`, `decoupler.op.hallmark`, …) to score transcription-factor or pathway activities.\n",
-    "We do not run an enrichment analysis here to keep this tutorial focused on the DE step; decoupler's [bulk functional analysis tutorial](https://decoupler-py.readthedocs.io/en/latest/notebooks/bulk.html) is a good starting point.\n"
+    "{mod}`decoupler` integrates well with the result tables produced here.\n",
+    "For example, you can pass the `log_fc` column from `res_df` (indexed by `variable`) into {any}`decoupler.mt.ulm` together with a prior-knowledge network from [OmniPath](https://omnipathdb.org/) ({func}`~decoupler.op.collectri`, {func}`~decoupler.op.progeny`, {func}`~decoupler.op.hallmark`, …) to score transcription-factor or pathway activities.\n",
+    "We do not run an enrichment analysis here to keep this tutorial focused on the DE step; decoupler's [bulk functional analysis tutorial](https://decoupler.readthedocs.io/en/latest/notebooks/bulk.html) is a good starting point.\n"
    ]
   },
   {
diff --git a/distance_tests.ipynb b/distance_tests.ipynb
index 23c7ad2..a138ab7 100644
--- a/distance_tests.ipynb
+++ b/distance_tests.ipynb
@@ -11,7 +11,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Pertpy offers several [Distances](https://pertpy.readthedocs.io/en/latest/usage/tools/pertpy.tools.Distance.html#pertpy.tools.Distance) to compute distances between groups of cells.\n",
+    "Pertpy offers several metrics through {class}`~pertpy.tools.Distance` to compute distances between groups of cells.\n",
     "To determine whether two groups came from the same distribution as evaluated by the Distance metric, pertpy provides Monte-Carlo permutation tests.\n",
     "This can be a valuable tool to assess whether a treatment had a significant effect on the transcript profiles of the cells."
    ]
diff --git a/distances.ipynb b/distances.ipynb
index 8427cd1..0907586 100644
--- a/distances.ipynb
+++ b/distances.ipynb
@@ -20,7 +20,7 @@
     "\n",
     "Other implemented distance functions have different ways of incorporating the distribution of single cells into the distance calculation.\n",
     "\n",
-    "Please refer to [Distance](https://pertpy.readthedocs.io/en/latest/usage/tools/pertpy.tools.Distance.html#pertpy.tools.Distance) for a complete list of available distances."
+    "Please refer to {class}`~pertpy.tools.Distance` for a complete list of available distances."
    ]
   },
   {
@@ -697,7 +697,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "See the [statistical testing notebook](https://pertpy.readthedocs.io/en/latest/tutorials/notebooks/distance_tests.html) for an example."
+    "See the {doc}`statistical testing notebook <./distance_tests>` for an example."
    ]
   }
  ],
diff --git a/expression_prediction_evaluation.ipynb b/expression_prediction_evaluation.ipynb
index 8e49051..23dfd0e 100644
--- a/expression_prediction_evaluation.ipynb
+++ b/expression_prediction_evaluation.ipynb
@@ -7,7 +7,7 @@
    "source": [
     "# Evaluation of expression prediction \n",
     "\n",
-    "Various models for prediction of gene expression upon perturbation or across modalities have been proposed, such as scGEN (see [pertpy tutorial](https://pertpy.readthedocs.io/en/latest/tutorials/notebooks/scgen_perturbation_prediction.html)), GEARS, chemCPA, PerturbNet, scPreGAN, and others. They are commonly evaluated with ground-truth data to which predictions are compared with various metrics that capture expression distribution similarity or truthfulness of downstream results based on DE genes or embeddings. To assess how different metrics correspond to data changes, we performed simulations (using SPARSim package) and computed the pertpy metrics on different simulated datasets. Below we first discuss characteristics of alternative metrics that can be used for model evaluation and then show how pertpy may be used to evaluate new model predictions."
+    "Various models for prediction of gene expression upon perturbation or across modalities have been proposed, such as scGEN (see {doc}`./scgen_perturbation_prediction`), GEARS, chemCPA, PerturbNet, scPreGAN, and others. They are commonly evaluated with ground-truth data to which predictions are compared with various metrics that capture expression distribution similarity or truthfulness of downstream results based on DE genes or embeddings. To assess how different metrics correspond to data changes, we performed simulations (using SPARSim package) and computed the pertpy metrics on different simulated datasets. Below we first discuss characteristics of alternative metrics that can be used for model evaluation and then show how pertpy may be used to evaluate new model predictions."
    ]
   },
   {
@@ -29,7 +29,7 @@
     "- We simulated two groups of cells. This can be done by using the Cauchy distribution of LFCs, sampling the LFCs to add upon the parameters of one condition (e.g. ground truth or control) to obtain the second condition (e.g. prediction or perturbed). \n",
     "- We simulated different distances between the two groups. By increasing or decreasing the width of the Cauchy distribution we simulated smaller or larger differences between the two cell groups. Below we show how different metrics behave in the presence of small and large differences.\n",
     "\n",
-    "For a description of the here used metrics please refer to the [pertpy documentation](https://pertpy.readthedocs.io/en/latest/usage/tools/pertpy.tools.Distance.html) and [distance tutorial](https://pertpy.readthedocs.io/en/latest/tutorials/notebooks/distances.html). Lower distance score corresponds to higher similarity."
+    "For a description of the metrics used here, please refer to the {class}`~pertpy.tools.Distance` documentation and the {doc}`distance tutorial <./distances>`. A lower distance score corresponds to higher similarity."
    ]
   },
   {
@@ -207,7 +207,7 @@
    "id": "a7a4b42d-89ea-4f2d-a3f6-3bdf811c114d",
    "metadata": {},
    "source": [
-    "For simplicity, we use mock data in a format similar to `eval_adata` from [scGEN tutorial](https://pertpy.readthedocs.io/en/latest/tutorials/notebooks/scgen_perturbation_prediction.html). Briefly, the adata should contain normalized log-transformed expression in X and two obs keys: cell_type and condition (ground truth control and perturbed as well as prediction of the perturbed condition). \n",
+    "For simplicity, we use mock data in a format similar to `eval_adata` from {doc}`./scgen_perturbation_prediction`. Briefly, the adata should contain normalized log-transformed expression in X and two obs keys: cell_type and condition (ground truth control and perturbed as well as prediction of the perturbed condition). \n",
     "\n",
     "The metrics are of relative nature and thus their interpretation relies on comparison across predictions obtained with different modeling settings (models, hyperparameters, training data, etc.) in order to select the best one. Thus, we created multiple predicted groups that correspond to better or worse prediction of the perturbed state. In particular, we generated one group that closely mimics the target, one that has expression shifted towards the input control, and one that has under-estimated variance. Please note that these simulations are over-simplified and we advise you to use a package designed specifically for scRNA-seq expression simulation, such as SPARSim, if you want to further evaluate individual metric performance."
    ]
diff --git a/milo.ipynb b/milo.ipynb
index ad27ed4..132afb2 100644
--- a/milo.ipynb
+++ b/milo.ipynb
@@ -206,7 +206,7 @@
    "id": "productive-growth",
    "metadata": {},
    "source": [
-    "When initializing the Milo object, we create a [MuData](https://mudata.readthedocs.io/en/latest/notebooks/quickstart_mudata.html) object which will store both the gene expression matrices (`rna` view) and cell count matrices used for differential abundance analysis (`milo` view). "
+    "When initializing the Milo object, we create a {class}`~mudata.MuData` object which will store both the gene expression matrices (`rna` view) and cell count matrices used for differential abundance analysis (`milo` view). "
    ]
   },
   {

From 416ce963cdcb58348617f044eb996eeb52ab2ce8 Mon Sep 17 00:00:00 2001
From: Lukas Heumos <lukas.heumos@posteo.net>
Date: Mon, 25 May 2026 12:12:27 +0200
Subject: [PATCH 4/4] de: revert {mod}`decoupler` to plain URL link

The top-level `decoupler` module is not registered as a py:module entry
in decoupler's objects.inv (only the submodules decoupler.mt, .op, etc.
are), so {mod}`decoupler` would render as a broken target. The four
specific refs ({any}`decoupler.mt.ulm` plus the {func} refs to
decoupler.op.collectri, .progeny, and .hallmark) still resolve through
the new intersphinx mapping in the pertpy companion PR.

Verified by parsing https://decoupler.readthedocs.io/en/latest/objects.inv
and checking every {role}\`target\` in the touched notebooks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 differential_gene_expression.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
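
Illustration for reviewers (sits below the `---`, so it is not part of the commit): a check like the one described above can be sketched with the standard library alone. The example assumes the version-2 Sphinx inventory layout (four plain-text header lines followed by a zlib-compressed payload) and runs on a tiny synthetic inventory so that it works offline. The entries below are invented to mirror the situation described in the commit message; they are not a copy of decoupler's real `objects.inv`, and `parse_inventory` and `build_inventory` are hypothetical helper names.

```python
import re
import zlib

# Layout of one payload line in a version-2 inventory:
#   name domain:role priority uri dispname
_ENTRY = re.compile(r"(.+?)\s+(\S+)\s+(-?\d+)\s+?(\S*)\s+(.*)")


def parse_inventory(raw: bytes) -> dict[str, set[str]]:
    """Map each 'domain:role' in a v2 objects.inv to the set of names it defines."""
    header, _, rest = raw.partition(b"\n")
    if not header.startswith(b"# Sphinx inventory version 2"):
        raise ValueError("only version-2 inventories are handled in this sketch")
    # Skip the remaining three header lines (project, version, compression note).
    for _ in range(3):
        _, _, rest = rest.partition(b"\n")
    entries: dict[str, set[str]] = {}
    for line in zlib.decompress(rest).decode("utf-8").splitlines():
        match = _ENTRY.match(line.rstrip())
        if match:
            name, kind = match.group(1), match.group(2)
            entries.setdefault(kind, set()).add(name)
    return entries


def build_inventory(lines: list[str]) -> bytes:
    """Assemble a synthetic inventory so the example needs no network access."""
    header = (
        "# Sphinx inventory version 2\n"
        "# Project: demo\n"
        "# Version: 0\n"
        "# The remainder of this file is compressed using zlib.\n"
    ).encode("utf-8")
    payload = zlib.compress(("\n".join(lines) + "\n").encode("utf-8"))
    return header + payload


# Synthetic entries: submodules are registered as modules, the top-level package is not.
inventory = parse_inventory(
    build_inventory(
        [
            "decoupler.mt py:module 0 api/mt.html -",
            "decoupler.op py:module 0 api/op.html -",
            "decoupler.mt.ulm py:data 1 api/mt.html#$ -",
            "decoupler.op.collectri py:function 1 api/op.html#$ -",
        ]
    )
)

modules = inventory.get("py:module", set())
all_names = set().union(*inventory.values())
top_level_missing = "decoupler" not in modules   # a {mod} ref would not resolve
ulm_resolves = "decoupler.mt.ulm" in all_names   # an {any} ref can still resolve
```

For a real check, the bytes passed to `parse_inventory` would be the downloaded `objects.inv` rather than the synthetic one built here.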

diff --git a/differential_gene_expression.ipynb b/differential_gene_expression.ipynb
index 5858d8e..2ee48fe 100644
--- a/differential_gene_expression.ipynb
+++ b/differential_gene_expression.ipynb
@@ -1281,7 +1281,7 @@
    "metadata": {},
    "source": [
     "Once you have a ranked gene table, a typical next step is **gene-set / pathway enrichment** to turn per-gene statistics into per-pathway statistics.\n",
-    "{mod}`decoupler` integrates well with the result tables produced here.\n",
+    "[decoupler](https://decoupler.readthedocs.io/) integrates well with the result tables produced here.\n",
     "For example, you can pass the `log_fc` column from `res_df` (indexed by `variable`) into {any}`decoupler.mt.ulm` together with a prior-knowledge network from [OmniPath](https://omnipathdb.org/) ({func}`~decoupler.op.collectri`, {func}`~decoupler.op.progeny`, {func}`~decoupler.op.hallmark`, …) to score transcription-factor or pathway activities.\n",
     "We do not run an enrichment analysis here to keep this tutorial focused on the DE step; decoupler's [bulk functional analysis tutorial](https://decoupler.readthedocs.io/en/latest/notebooks/bulk.html) is a good starting point.\n"
    ]