add MiloDE #1001
Merged
For each nhood, pseudobulks cells by sample (sc.get.aggregate), fits the existing PyDESeq2 or Statsmodels DE model, and applies the two-axis correction miloDE uses: BH across genes within each nhood and density-weighted BH across nhoods per gene. The weighted-BH used by da_nhoods is factored out into a shared _weighted_bh helper. Validated against miloDE R on synthetic data: median Spearman(logFC) = 0.996 across genes per nhood, 100% sign-concordance on R-significant entries. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
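The density-weighted BH step can be illustrated with a minimal NumPy sketch. This is not pertpy's `_weighted_bh`: the function name `weighted_bh`, its signature, and the NaN handling are assumptions made for illustration. In Milo-style spatial FDR the weights are usually the reciprocal of each nhood's k-th nearest-neighbour distance; that choice is assumed here, not taken from this PR.

```python
import numpy as np


def weighted_bh(pvalues, weights):
    """Weighted Benjamini-Hochberg adjustment (illustrative sketch).

    Assumes strictly positive weights. NaN p-values (e.g. skipped nhoods)
    are left as NaN and excluded from the correction.
    """
    p = np.asarray(pvalues, dtype=np.float64)
    w = np.asarray(weights, dtype=np.float64)
    adjusted = np.full(p.shape, np.nan)

    keep = ~np.isnan(p)
    if not keep.any():
        return adjusted

    # Sort the retained p-values in ascending order, carrying weights along.
    order = np.argsort(p[keep])
    p_sorted = p[keep][order]
    w_sorted = w[keep][order]

    # Weighted analogue of n * p_(i) / i: total weight over cumulative weight.
    raw = w_sorted.sum() * p_sorted / np.cumsum(w_sorted)

    # Enforce monotonicity from the largest p-value downwards, cap at 1.
    monotone = np.minimum.accumulate(raw[::-1])[::-1]
    restored = np.empty_like(monotone)
    restored[order] = np.minimum(monotone, 1.0)

    adjusted[keep] = restored
    return adjusted


# Equal weights reduce to ordinary BH; the NaN entry stays NaN.
adjusted = weighted_bh([0.01, 0.04, np.nan, 0.5], [1.0, 1.0, 1.0, 1.0])
```

With unit weights the same function covers the first axis (BH across genes within one nhood); with per-nhood density weights it covers the second axis (across nhoods for one gene).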
Switch return type from AnnData(n_nhoods × n_genes) to a long DataFrame with columns (nhood, variable, log_fc, p_value, adj_p_value, pval_corrected_across_nhoods, test_performed) — same shape pertpy's other DE methods produce and the same shape miloDE R returns. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
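To make the long format concrete, here is a small pandas sketch using the column names listed above. The values are invented for illustration, and treating `test_performed` as a boolean column is an assumption. Pivoting on one statistic recovers the earlier nhood × gene layout when a matrix view is needed.

```python
import numpy as np
import pandas as pd

# Illustrative values only; column names follow the commit message.
de = pd.DataFrame(
    {
        "nhood": [0, 0, 1, 1],
        "variable": ["ISG15", "ISG20", "ISG15", "ISG20"],
        "log_fc": np.array([2.1, 1.4, np.nan, np.nan], dtype=np.float32),
        "p_value": [1e-200, 1e-50, np.nan, np.nan],
        "adj_p_value": [2e-200, 1e-50, np.nan, np.nan],
        "pval_corrected_across_nhoods": [1e-200, 1e-50, np.nan, np.nan],
        # Assumed boolean flag: False marks a nhood that was skipped.
        "test_performed": [True, True, False, False],
    }
)

# Matrix view of a single statistic: rows are nhoods, columns are genes.
log_fc_matrix = de.pivot(index="nhood", columns="variable", values="log_fc")

# Restrict to nhoods where the test actually ran.
tested = de[de["test_performed"]]
```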
Pairs with de_nhoods the same way plot_nhood_graph pairs with da_nhoods: takes the long DataFrame + a gene name and renders the nhood graph colored by that gene's logFC, masking nhoods above the spatial-FDR threshold. The rendering body shared with plot_nhood_graph is factored into a private _render_nhood_graph helper to avoid duplication. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
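The masking step can be sketched independently of the plotting code. The helper below is hypothetical (`masked_log_fc` is not a pertpy function, and the default `alpha` of 0.1 is an assumption); it only shows how per-nhood logFC values for one gene could be blanked out when the across-nhood corrected p-value exceeds the threshold.

```python
import numpy as np
import pandas as pd


def masked_log_fc(de: pd.DataFrame, gene: str, alpha: float = 0.1) -> pd.Series:
    """Per-nhood logFC for one gene, NaN where the nhood is not significant."""
    rows = de[de["variable"] == gene].set_index("nhood")
    # NaN p-values compare as False, so untested nhoods are masked as well.
    significant = rows["pval_corrected_across_nhoods"] <= alpha
    return rows["log_fc"].where(significant)


# Invented values for illustration.
example = pd.DataFrame(
    {
        "nhood": [0, 1, 2, 0],
        "variable": ["ISG15", "ISG15", "ISG15", "ISG20"],
        "log_fc": [2.0, 0.3, np.nan, 1.0],
        "pval_corrected_across_nhoods": [0.001, 0.5, np.nan, 0.01],
    }
)
colors = masked_log_fc(example, "ISG15")
```

The resulting Series is indexed by nhood, so it can be aligned with whatever per-nhood table drives the graph rendering.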
Pulls in scverse/pertpy-tutorials#61 which appends a Milo.de_nhoods + plot_de_nhood_graph demonstration to milo.ipynb. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PyDESeq2 prints progress per fit; calling it once per nhood otherwise floods stdout. Wrap the per-nhood model fit/test in contextlib.redirect_stdout so de_nhoods runs cleanly. Re-points the tutorials submodule to scverse/pertpy-tutorials@cffc9b5 which adds the miloDE section to milo.ipynb, with warnings filtered at the top of the notebook so per-nhood validity messages and small-sample PyDESeq2 warnings don't drown out cell outputs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
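The stdout suppression pattern looks roughly like this. `noisy_fit` is a hypothetical stand-in for a model fit that prints progress; it is not PyDESeq2's API. Note that `contextlib.redirect_stdout` only captures writes that go through Python's `sys.stdout`.

```python
import contextlib
import io


def noisy_fit(nhood: int) -> float:
    """Hypothetical stand-in for a per-nhood fit that prints progress."""
    print(f"Fitting nhood {nhood}...")
    return nhood * 0.5


def fit_quietly(nhood: int) -> float:
    # Everything printed inside the block goes to a throwaway buffer.
    with contextlib.redirect_stdout(io.StringIO()):
        return noisy_fit(nhood)


# One call per nhood, with no progress lines reaching the terminal.
results = [fit_quietly(i) for i in range(3)]
```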
scverse/pertpy-tutorials@9ff43c2 switches the miloDE tutorial section to kang_2018 (8 patients × ctrl/stim, raw counts in .X) so the example recovers ISG15 / ISG20 as top hits rather than mostly-NaN results. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two related fixes plus a tutorials submodule bump. de_nhoods previously stored p-values as float32, which underflows to zero below ~1.4e-45. Strong DE signals (IFN-β ISGs in the kang_2018 example) routinely produce p ~ 1e-200 from PyDESeq2, which was getting silently clamped to 0 in both the raw p_value and adj_p_value columns (and hence pval_corrected_across_nhoods too). Storing p-values in float64 fixes it; logFC stays float32. Also collapses the per-nhood `Nhood X: DE test failed (...)` logger warnings (one per skipped nhood) into a single end-of-run summary, so the tutorial doesn't have to reach into pertpy._logger to silence them. Tutorials submodule bumps to scverse/pertpy-tutorials@a0b9e84, which drops the pertpy._logger import from milo.ipynb and re-executes against this fix (top hits now show p_value ~ 1e-73, not 0). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
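The underflow is easy to reproduce with NumPy alone. The snippet below is a standalone demonstration, not code from pertpy; it uses the p-value magnitude quoted above.

```python
import numpy as np

# p-value magnitudes of the kind mentioned in the commit message.
p_values = np.array([1e-200, 1e-73], dtype=np.float64)

# Casting to float32 silently flushes both values to zero.
as_float32 = p_values.astype(np.float32)

# Smallest positive float32 (subnormal), roughly 1.4e-45.
smallest_float32 = np.nextafter(np.float32(0), np.float32(1))

# float64 keeps both values distinct from zero.
still_positive = bool((p_values > 0).all())
```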
The miloDE tutorial section landed via scverse/pertpy-tutorials#61 (squashed to d2ab4bf on main); move the pinned commit from the feature-branch tip to the merged main commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Matches the tutorial example. The synthetic-data version was less informative. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes #635.
Implements miloDE-style per-neighbourhood differential expression testing as Milo.de_nhoods. Pure Python, no edgeR.

API

milo.plot_de_nhood_graph(mdata, de, gene="ISG15") on kang_2018 (canonical IFN-β response gene; expected strong positive logFC in many nhoods).

What it does
For each nhood: pseudobulks cells by sample via sc.get.aggregate, fits the existing pertpy DE model (PyDESeq2 or Statsmodels), and applies the two-axis correction miloDE uses: BH across genes within each nhood, and density-weighted BH across nhoods per gene (the same correction da_nhoods uses, now factored into a shared _weighted_bh helper). Nhoods that fail validity checks (too few samples per condition, rank-deficient design) are skipped and flagged in test_performed; one summary log line is emitted at the end rather than one per skipped nhood.

plot_de_nhood_graph is the DE-side counterpart of plot_nhood_graph: it takes the long DataFrame plus a gene name and renders the nhood graph colored by that gene's logFC, masking nhoods with pval_corrected_across_nhoods > alpha. The shared rendering body is factored into a private helper used by both.

Validation against miloDE R
Ran the R package on the same synthetic data and nhood structure (71 nhoods × 200 genes, planted DE in one population):

test_performed (R / Py / both)

Remaining differences are NB-GLM Wald (PyDESeq2) vs edgeR's QLF on small samples.
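A comparison of this kind can be sketched as follows. This is not the validation script used for the PR: the function names, the `significant` column on the R side, and the toy numbers are all assumptions for illustration. Spearman correlation is computed as the Pearson correlation of ranks to avoid extra dependencies.

```python
import numpy as np
import pandas as pd


def spearman(x: pd.Series, y: pd.Series) -> float:
    """Spearman correlation as the Pearson correlation of average ranks."""
    return float(np.corrcoef(x.rank(), y.rank())[0, 1])


def compare_log_fc(r_de: pd.DataFrame, py_de: pd.DataFrame) -> tuple[float, float]:
    """Median per-nhood Spearman(logFC) and sign concordance on R-significant rows."""
    merged = r_de.merge(py_de, on=["nhood", "variable"], suffixes=("_r", "_py"))
    merged = merged.dropna(subset=["log_fc_r", "log_fc_py"])

    # One correlation across genes per nhood, then the median over nhoods.
    rhos = [
        spearman(group["log_fc_r"], group["log_fc_py"])
        for _, group in merged.groupby("nhood")
    ]
    median_rho = float(np.median(rhos))

    # Fraction of R-significant entries where both logFCs share a sign.
    sig = merged[merged["significant"]]
    if sig.empty:
        return median_rho, float("nan")
    same_sign = np.sign(sig["log_fc_r"]) == np.sign(sig["log_fc_py"])
    return median_rho, float(same_sign.mean())


# Toy inputs, invented for illustration.
r_de = pd.DataFrame(
    {
        "nhood": [0, 0, 0, 1, 1, 1],
        "variable": ["a", "b", "c"] * 2,
        "log_fc": [1.0, -0.5, 2.0, 0.1, 0.2, 0.3],
        "significant": [True, False, True, False, False, False],
    }
)
py_de = pd.DataFrame(
    {
        "nhood": [0, 0, 0, 1, 1, 1],
        "variable": ["a", "b", "c"] * 2,
        "log_fc": [0.9, -0.4, 2.2, 0.1, 0.3, 0.2],
    }
)
median_rho, sign_concordance = compare_log_fc(r_de, py_de)
```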