Skip to content

Commit 012c1ee

Browse files
Copilotakhanf
andauthored
Add group-level otsu thresholding (all_group_otsu target, group_otsu rule, multiotsu_group rule)"
Agent-Logs-Url: https://github.com/khanlab/SPIMquant/sessions/7b6fb2c4-f7b5-4329-944e-d9191a3c0f96 Co-authored-by: akhanf <11492701+akhanf@users.noreply.github.com>
1 parent b3c2893 commit 012c1ee

7 files changed

Lines changed: 503 additions & 4 deletions

File tree

docs/howto/segmentation.md

Lines changed: 45 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -10,10 +10,12 @@ flowchart LR
1010
B --> C{seg_method}
1111
C -->|threshold| D[Threshold]
1212
C -->|otsu+k3i2| E[Multi-Otsu + k-means]
13-
D --> F[Binary Mask]
14-
E --> F
15-
F --> G[Clean: remove edge / small objects]
16-
G --> H[Segmentation Output]
13+
C -->|groupotsu+k3i2| F[Group Multi-Otsu]
14+
D --> G[Binary Mask]
15+
E --> G
16+
F --> G
17+
G --> H[Clean: remove edge / small objects]
18+
H --> I[Segmentation Output]
1719
```
1820

1921
After segmentation the binary mask is used to compute [field fraction](../reference/outputs.md#field-fraction-map-seg), [object count](../reference/outputs.md#object-count-map-seg), and [per-object region properties](../reference/outputs.md#region-properties-statistics-table-tabular).
@@ -81,6 +83,45 @@ stain_defaults:
8183

8284
**Limitations:** Can fail on images with unusual histograms (e.g. very sparse pathology that does not form a distinct peak) or when the background is very noisy.
8385

86+
### Group Multi-Otsu (`seg_method: groupotsu+k3i2`)
87+
88+
A variant of Multi-Otsu that derives a **single shared threshold from the aggregate histogram of all subjects** rather than computing a threshold independently per image. This is preferred when subjects were acquired with common acquisition settings and you want to ensure consistent, comparable quantification across the cohort.
89+
90+
The workflow is a two-step process:
91+
92+
**Step 1 — compute group threshold** (run once for the whole cohort):
93+
94+
```bash
95+
spimquant /bids /output participant \
96+
--targets all_group_otsu \
97+
--seg_method groupotsu+k3i2
98+
```
99+
100+
This triggers:
101+
102+
1. For each subject: compute a percentile-clipped intensity histogram from the bias-field corrected image and save it as an NPZ file.
103+
2. Aggregate all subject histograms onto a common intensity grid, apply multi-level Otsu thresholding, and save the resulting thresholds as a JSON file in `{output}/group/`.
104+
105+
**Step 2 — segment each subject using the group threshold**:
106+
107+
```bash
108+
spimquant /bids /output participant \
109+
--seg_method groupotsu+k3i2
110+
```
111+
112+
Each subject's binary mask is produced by applying the group-level threshold from the JSON file. A per-subject PNG is also generated showing the group threshold overlaid on the individual histogram, useful for visual quality control.
113+
114+
**Config key:**
115+
116+
```yaml
117+
seg_method:
118+
- groupotsu+k3i2
119+
```
120+
121+
**When to use:** Preferred when a batch of subjects shares the same acquisition protocol and you want consistent thresholding across subjects. Reduces subject-to-subject variability in the segmentation boundary that can occur with per-subject Otsu.
122+
123+
**Limitations:** Less adaptive than per-subject Otsu — if staining intensity varies substantially across subjects (e.g. due to different batches of antibody or tissue preparation), a single group threshold may over- or under-segment some subjects.
124+
84125
---
85126

86127
## Post-Segmentation Cleaning

spimquant/workflow/Snakefile

Lines changed: 40 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -85,6 +85,9 @@ stains_for_seg = list(set(config["stains_for_seg"]).intersection(set(stains)))
8585
# seg methods that use multi-Otsu thresholding (otsu+k{}i{} pattern)
8686
otsu_seg_methods = [m for m in config["seg_method"] if m.startswith("otsu+")]
8787

88+
# seg methods that use group-level multi-Otsu thresholding (groupotsu+k{}i{} pattern)
89+
groupotsu_seg_methods = [m for m in config["seg_method"] if m.startswith("groupotsu+")]
90+
8891
if len(stains_for_seg) == 0 or config["no_segmentation"]:
8992
do_seg = False
9093
do_coloc = False
@@ -908,6 +911,43 @@ rule all_group:
908911
rules.all_group_stats_coloc.input if do_coloc else [],
909912

910913

914+
rule all_group_otsu:
915+
"""Target rule for group-level Otsu threshold computation.
916+
917+
Aggregates intensity histograms across all subjects for each stain and
918+
computes a single set of Otsu thresholds that can be applied consistently
919+
to every subject. Run this rule before participant-level segmentation
920+
when ``groupotsu+k{}i{}`` is included in ``seg_method``.
921+
922+
Example::
923+
924+
spimquant /bids /output participant \\
925+
--targets all_group_otsu \\
926+
--seg_method groupotsu+k3i2
927+
928+
Then run segmentation using the computed group thresholds::
929+
930+
spimquant /bids /output participant \\
931+
--seg_method groupotsu+k3i2
932+
"""
933+
input:
934+
expand(
935+
bids(
936+
root=root,
937+
datatype="group",
938+
stain="{stain}",
939+
level="{level}",
940+
desc="{desc}",
941+
suffix="thresholds.json",
942+
),
943+
stain=stains_for_seg,
944+
level=config["segmentation_level"],
945+
desc=groupotsu_seg_methods,
946+
)
947+
if (do_seg and groupotsu_seg_methods)
948+
else [],
949+
950+
911951
include: "rules/import.smk"
912952
include: "rules/masking.smk"
913953
include: "rules/templatereg.smk"

spimquant/workflow/rules/groupstats.smk

Lines changed: 58 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4,9 +4,67 @@ Group-level statistical analysis rules for SPIMquant.
44
This module performs group-based statistical tests on segmentation statistics
55
(e.g., fieldfrac, density, volume) across participants, using metadata from
66
participants.tsv to define contrasts.
7+
8+
It also provides the ``group_otsu`` rule which aggregates per-subject intensity
9+
histograms (produced by ``compute_subject_histogram``) to compute a single set
10+
of Otsu thresholds shared across all subjects.
711
"""
812

913

14+
rule group_otsu:
15+
"""Compute group-level Otsu thresholds from aggregated per-subject histograms.
16+
17+
Collects the intensity histogram NPZ files computed by
18+
``compute_subject_histogram`` for all subjects, merges them onto a
19+
common intensity grid, and applies multi-level Otsu thresholding to the
20+
aggregate histogram. The resulting thresholds are saved as a JSON file
21+
(consumed by ``multiotsu_group`` during participant-level segmentation)
22+
and as a PNG figure for visual inspection.
23+
24+
This rule is the target of ``all_group_otsu`` and should be run before
25+
participant-level segmentation when ``groupotsu+k{}i{}`` is used as the
26+
segmentation method.
27+
"""
28+
input:
29+
histogram_npz=lambda wildcards: inputs["spim"].expand(
30+
bids(
31+
root=work,
32+
datatype="seg",
33+
stain=wildcards.stain,
34+
level=wildcards.level,
35+
desc="groupotsu+k{k}i{i}".format(k=wildcards.k, i=wildcards.i),
36+
suffix="histogram.npz",
37+
**inputs["spim"].wildcards,
38+
)
39+
),
40+
params:
41+
otsu_k=lambda wildcards: int(wildcards.k),
42+
otsu_threshold_index=lambda wildcards: int(wildcards.i),
43+
output:
44+
thresholds_json=bids(
45+
root=root,
46+
datatype="group",
47+
stain="{stain}",
48+
level="{level}",
49+
desc="groupotsu+k{k,[0-9]+}i{i,[0-9]+}",
50+
suffix="thresholds.json",
51+
),
52+
thresholds_png=bids(
53+
root=root,
54+
datatype="group",
55+
stain="{stain}",
56+
level="{level}",
57+
desc="groupotsu+k{k,[0-9]+}i{i,[0-9]+}",
58+
suffix="thresholds.png",
59+
),
60+
threads: 4
61+
resources:
62+
mem_mb=8000,
63+
runtime=10,
64+
script:
65+
"../scripts/group_otsu.py"
66+
67+
1068
rule perform_group_stats:
1169
"""Perform group-based statistical tests on segmentation statistics.
1270

spimquant/workflow/rules/segmentation.smk

Lines changed: 103 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -152,6 +152,109 @@ rule multiotsu:
152152
"../scripts/multiotsu.py"
153153

154154

155+
rule compute_subject_histogram:
156+
"""Compute intensity histogram for a single subject for group-level Otsu thresholding.
157+
158+
Calculates a percentile-clipped histogram of the bias-field corrected image
159+
and saves it as an NPZ file (hist_counts + bin_edges). These per-subject
160+
histogram files are later aggregated by the ``group_otsu`` rule to derive
161+
a single set of thresholds that can be applied consistently across the whole
162+
cohort with the ``multiotsu_group`` rule.
163+
"""
164+
input:
165+
corrected=bids(
166+
root=work,
167+
datatype="seg",
168+
stain="{stain}",
169+
level="{level}",
170+
desc="corrected{method}".format(method=config["correction_method"]),
171+
suffix="SPIM.ome.zarr",
172+
**inputs["spim"].wildcards,
173+
),
174+
params:
175+
hist_bin_width=float(config["seg_hist_bin_width"]),
176+
hist_percentile_range=[float(x) for x in config["seg_hist_percentile_range"]],
177+
zarrnii_kwargs={"orientation": config["orientation"]},
178+
output:
179+
histogram_npz=bids(
180+
root=work,
181+
datatype="seg",
182+
stain="{stain}",
183+
level="{level}",
184+
desc="groupotsu+k{k,[0-9]+}i{i,[0-9]+}",
185+
suffix="histogram.npz",
186+
**inputs["spim"].wildcards,
187+
),
188+
threads: 128 if config["dask_scheduler"] == "distributed" else 32
189+
resources:
190+
mem_mb=500000 if config["dask_scheduler"] == "distributed" else 250000,
191+
runtime=90,
192+
script:
193+
"../scripts/compute_subject_histogram.py"
194+
195+
196+
rule multiotsu_group:
197+
"""Apply group-level Otsu threshold for segmentation.
198+
199+
Uses a pre-computed group-level Otsu threshold (derived from an aggregate
200+
histogram across all subjects) to create a binary mask for the current
201+
subject. This ensures consistent thresholding across the whole cohort.
202+
A per-subject PNG is also produced showing the group threshold overlaid
203+
on this subject's own intensity histogram for visual quality control.
204+
205+
Run ``all_group_otsu`` before using this rule so that the group threshold
206+
JSON file is available.
207+
"""
208+
input:
209+
corrected=bids(
210+
root=work,
211+
datatype="seg",
212+
stain="{stain}",
213+
level="{level}",
214+
desc="corrected{method}".format(method=config["correction_method"]),
215+
suffix="SPIM.ome.zarr",
216+
**inputs["spim"].wildcards,
217+
),
218+
thresholds_json=bids(
219+
root=root,
220+
datatype="group",
221+
stain="{stain}",
222+
level="{level}",
223+
desc="groupotsu+k{k,[0-9]+}i{i,[0-9]+}",
224+
suffix="thresholds.json",
225+
),
226+
params:
227+
hist_bin_width=float(config["seg_hist_bin_width"]),
228+
hist_percentile_range=[float(x) for x in config["seg_hist_percentile_range"]],
229+
zarrnii_kwargs={"orientation": config["orientation"]},
230+
output:
231+
mask=bids(
232+
root=root,
233+
datatype="seg",
234+
stain="{stain}",
235+
level="{level}",
236+
desc="groupotsu+k{k,[0-9]+}i{i,[0-9]+}",
237+
suffix="mask.ozx",
238+
**inputs["spim"].wildcards,
239+
),
240+
thresholds_png=bids(
241+
root=root,
242+
datatype="seg",
243+
stain="{stain}",
244+
level="{level}",
245+
desc="groupotsu+k{k,[0-9]+}i{i,[0-9]+}",
246+
suffix="thresholds.png",
247+
**inputs["spim"].wildcards,
248+
),
249+
threads: 128 if config["dask_scheduler"] == "distributed" else 32
250+
resources:
251+
mem_mb=500000 if config["dask_scheduler"] == "distributed" else 250000,
252+
disk_mb=2097152,
253+
runtime=180,
254+
script:
255+
"../scripts/multiotsu_group.py"
256+
257+
155258
rule threshold:
156259
"""Apply simple intensity threshold for segmentation.
157260
Lines changed: 65 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,65 @@
1+
"""Compute and save an intensity histogram for a single subject.
2+
3+
Used as the first step of group-level Otsu thresholding. Each subject
4+
independently computes its histogram from the bias-field corrected image;
5+
the resulting per-subject NPZ files are later aggregated by the
6+
``group_otsu`` rule to derive a single set of thresholds shared across the
7+
whole cohort.
8+
9+
This is a Snakemake script; the ``snakemake`` object is automatically
10+
provided when executed as part of a Snakemake workflow.
11+
"""
12+
13+
import numpy as np
14+
15+
from dask_setup import get_dask_client
16+
from zarrnii import ZarrNii
17+
18+
if __name__ == "__main__":
19+
with get_dask_client(snakemake.config["dask_scheduler"], snakemake.threads):
20+
21+
zarrnii_kwargs = snakemake.params.zarrnii_kwargs
22+
pct_lo, pct_hi = snakemake.params.hist_percentile_range
23+
bin_width = snakemake.params.hist_bin_width
24+
25+
# load a downsampled version to estimate the percentile-based range
26+
print("estimating intensity range from downsampled image...")
27+
znimg_ds = None
28+
for ds_level in [5, 4, 3, 2, 1]:
29+
try:
30+
candidate = ZarrNii.from_ome_zarr(
31+
snakemake.input.corrected, level=ds_level, **zarrnii_kwargs
32+
)
33+
znimg_ds = candidate
34+
break
35+
except Exception:
36+
pass
37+
38+
if znimg_ds is None:
39+
znimg_ds = ZarrNii.from_ome_zarr(
40+
snakemake.input.corrected, **zarrnii_kwargs
41+
)
42+
43+
data_ds = znimg_ds.data.compute().ravel().astype(np.float32)
44+
range_lo = float(np.percentile(data_ds, pct_lo))
45+
range_hi = float(np.percentile(data_ds, pct_hi))
46+
print(
47+
f" 📊 percentile range [{pct_lo}%, {pct_hi}%]: [{range_lo:.3f}, {range_hi:.3f}]"
48+
)
49+
50+
# compute number of bins from bin width
51+
n_bins = max(2, int(np.ceil((range_hi - range_lo) / bin_width)))
52+
print(f" 📊 bins: {n_bins} (bin width: {bin_width})")
53+
54+
# compute full-resolution histogram
55+
znimg = ZarrNii.from_ome_zarr(snakemake.input.corrected, **zarrnii_kwargs)
56+
(hist_counts, bin_edges) = znimg.compute_histogram(
57+
bins=n_bins, range=[range_lo, range_hi]
58+
)
59+
60+
print(f"saving histogram to {snakemake.output.histogram_npz}")
61+
np.savez(
62+
snakemake.output.histogram_npz,
63+
hist_counts=hist_counts,
64+
bin_edges=bin_edges,
65+
)

0 commit comments

Comments
 (0)