
Commit b9bc34c

Author: Henry Wallace (committed)
Make differentiable Spearman a default dep (diffsort); remove sorting extra
- pyproject: add diffsort>=0.1.0 to dependencies, remove [sorting] optional
- README/Justfile/MAKING_IT_GOOD: no longer mention --extra sorting
1 parent 0d0a8a4 commit b9bc34c

4 files changed: 8 additions & 11 deletions

README.md

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ ICF is normalized to \([0, 1]\): **0.0 = very common**, **1.0 = very rare**.
 
 ```bash
 uv sync --extra dev
-# Recommended for multi-task training: uv sync --extra sorting (torchsort or diffsort for differentiable Spearman; backend is logged at train start)
+# Differentiable Spearman (diffsort) is a default dependency; backend logged at train start
 
 # Train
 uv run tiny-icf-train --help

docs/guides/MAKING_IT_GOOD_MINIMAL_HEURISTICS.md

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ Research-backed, low-heuristic improvements. No hand-picked anchor words or ad-h
 
 **Fix:** Use **differentiable Spearman** via soft sorting (Blondel et al., "Fast Differentiable Sorting and Ranking", ICML 2020; [arxiv 2002.08871](https://arxiv.org/abs/2002.08871)). Loss = \( \frac{1}{2}\|r - r_\Psi(\theta)\|^2 \) where \( r_\Psi \) are soft ranks. Implementations: **torchsort** (O(n log n), recommended), **diffsort** (O(n²(log n)²)).
 
-**Implemented:** `loss_unified.spearman_loss_tensor` with `spearman_method="auto"` (default): use **torchsort** if available, else **diffsort**, else rank_relax or built-in soft_rank. All paths are differentiable. Install `uv sync --extra sorting` for torchsort and/or diffsort. At training start we log `Spearman loss backend: <torchsort|diffsort|rank_relax|built-in>`. CLI: `--spearman-reg-strength 0.1`, `--spearman-method auto|torchsort|diffsort|sigmoid`.
+**Implemented:** `loss_unified.spearman_loss_tensor` with `spearman_method="auto"` (default): use **torchsort** if available, else **diffsort** (default dependency), else rank_relax or built-in soft_rank. All paths are differentiable. At training start we log `Spearman loss backend: <torchsort|diffsort|rank_relax|built-in>`. CLI: `--spearman-reg-strength 0.1`, `--spearman-method auto|torchsort|diffsort|sigmoid`.
 
 ---
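The hunk above describes the soft-rank Spearman surrogate \( \frac{1}{2}\|r - r_\Psi(\theta)\|^2 \) and mentions a built-in sigmoid-based soft_rank fallback. A minimal numpy sketch of that idea (forward pass only; `soft_rank` and `spearman_surrogate_loss` are illustrative names, not the repo's actual `loss_unified` API — in PyTorch the same expressions would be differentiable end to end):

```python
import numpy as np

def soft_rank(x, tau=0.01):
    # Soft 1-based ascending ranks: rank_i = 1 + sum_{j != i} sigmoid((x_i - x_j) / tau).
    # As tau -> 0 this approaches the hard ranks for distinct values.
    pairwise = (x[:, None] - x[None, :]) / tau
    sig = 1.0 / (1.0 + np.exp(-pairwise))
    return sig.sum(axis=1) - 0.5 + 1.0  # drop the sigmoid(0) = 0.5 self-comparison

def spearman_surrogate_loss(pred, target, tau=0.01):
    # 0.5 * ||r - r_Psi||^2: hard 1-based target ranks r vs soft predicted ranks r_Psi.
    r_true = np.argsort(np.argsort(target)) + 1.0
    r_soft = soft_rank(pred, tau)
    return 0.5 * np.sum((r_true - r_soft) ** 2)
```

When prediction and target agree in rank order the loss is near zero; an order-reversed prediction is penalized by the squared rank displacement of every element. This is the O(n²) pairwise-sigmoid route; torchsort's projection-based soft ranks achieve the same surrogate in O(n log n).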
justfile

Lines changed: 1 addition & 1 deletion
@@ -64,7 +64,7 @@ sync-s3:
     aws s3 sync models/ s3://arclabs-backups/tiny-icf/models/ --exclude "*" --include "multitask_*.pt" --include "v3_base*.pt" --include "*.pt.cal.json"
 
 # English-only training (better "the"/"and", no lang prefix); uses frequency sampling + spearman-method auto
-# For differentiable Spearman: uv sync --extra sorting (torchsort or diffsort; backend logged at start)
+# Differentiable Spearman (diffsort by default; torchsort if installed); backend logged at start
 # For custom EPOCHS/SAMPLES run: uv run python scripts/train_all_fronts.py ... --epochs N --train-max-samples M
 train-en DATA="data/word_frequency.csv" EPOCHS="30" SAMPLES="200000":
     mkdir -p models/all_fronts_en

pyproject.toml

Lines changed: 5 additions & 8 deletions
@@ -10,11 +10,12 @@ dependencies = [
     "numpy>=1.24.0",
     "pandas>=2.0.0",
     "tqdm>=4.65.0",
-    "scipy>=1.10.0", # For correlation metrics
-    "lightning>=2.0.0", # PyTorch Lightning for non-interactive training
-    "aim>=3.29.0", # Experiment tracking
-    "requests>=2.31.0", # For dataset downloading
+    "scipy>=1.10.0",  # For correlation metrics
+    "lightning>=2.0.0",  # PyTorch Lightning for non-interactive training
+    "aim>=3.29.0",  # Experiment tracking
+    "requests>=2.31.0",  # For dataset downloading
     "wordfreq>=3.1.1",
+    "diffsort>=0.1.0",  # Differentiable Spearman (sorting networks); torchsort used if installed
 ]
 
 [project.optional-dependencies]
@@ -31,10 +32,6 @@ dependencies = [
     "sentence-transformers>=2.2.0",  # Lightweight teacher models
     "transformers>=4.30.0",  # For BERT/RoBERTa teacher models (optional)
 ]
-sorting = [
-    "diffsort>=0.1.0",  # Differentiable sorting networks (ICLR 2022)
-    "torchsort>=0.1.6",  # Fast differentiable sorting (O(n log n), recommended)
-]
 
 [project.scripts]
 tiny-icf-train = "tiny_icf.train:main"
