Stripe size-sorted files across parallel jobs by SanderMuller · Pull Request #5844 · phpstan/phpstan-src

SanderMuller · 2026-06-11T07:55:06Z

What & why

The scheduler chunks files into jobs in path order. Directories tend to cluster files of
similar weight, so one job can collect a directory's heaviest files and become the
straggler the whole run waits for while other workers sit idle.

This change sorts the files by size and deals them round-robin across the jobs, so every
job gets the same size profile. Job count and the per-job ceiling (jobSize) stay the
same; only the composition changes. File sizes are supplied from the outside
(callable(string): int), so the Scheduler itself stays unit-testable without touching
the filesystem.

Benchmarks

Complete make phpstan on phpstan-src (clears the result cache itself, so every run is
a full analysis of 2,364 files), two clean worktrees with primed file caches,
hyperfine --warmup 1 --runs 6, Apple M4 Pro / PHP 8.5.7:

Command	Mean [s]	Min [s]	Max [s]	Relative
base `ff2647a` — `make phpstan`	19.047 ± 0.100	18.921	19.215	1.08 ± 0.01
PR — `make phpstan`	17.573 ± 0.068	17.454	17.635	1.00

−1.47 s, 1.08× faster; user CPU unchanged within noise (111.3 s vs 113.3 s) — the win is
load balance, not less work. A reversed-order pass reproduces it (17.77 ± 0.24 s vs
19.04 ± 0.23 s), so it is not an ordering or thermal artifact.

One detail worth knowing: sorting without striping makes things considerably worse
(+46% wall on a smaller corpus), because plain chunking of a sorted list concentrates all
heavy files into the first job. The striping is what makes the sort pay off, and
testHeaviestFilesAreSpreadAcrossJobs pins exactly that property.

File size is a proxy for analysis cost, not a perfect predictor, but across repeated runs
it consistently beat path-order chunking and never lost to it.

Tests

make tests (12,713, green), make phpstan, make cs, make lint and
make composer-dependency-analyser all pass. Analysis output is byte-identical to the
baseline; only the distribution of files over worker jobs changes. SchedulerTest covers
the striped layout exactly (six distinctly-sized files: the three heaviest land in three
different jobs) and that every file is scheduled exactly once with jobs ≤ jobSize.

🤖 Generated with Claude Code

ondrejmirtes

I like this change, I wondered about it several times myself.

ondrejmirtes · 2026-06-11T08:14:07Z

 				124,
 				2,
-				[30, 30, 30, 30, 4],
+				[25, 25, 25, 25, 24],


I think the tests could be more expressive here. Could we supply differently sized files here, or maybe change the Scheduler signature so that we provide file sizes from the outside and the class is still unit testable?

Done — scheduleWork() now takes a callable(string): int for the file sizes, so the class is unit-testable without touching the filesystem; both production callers pass fn ($file) => (int) @filesize($file). Added two expressive tests with synthetic sizes: one asserts the exact striped layout for six distinctly-sized files (the three heaviest files end up in three different jobs, each paired with a light file), the other checks every file is scheduled exactly once and no job exceeds jobSize for a larger irregular set. The existing count-based dataset tests now pass a constant-size callback.

staabm · 2026-06-11T08:50:24Z

+		$numberOfJobs = (int) ceil(count($files) / $this->jobSize);
+		$stripedJobs = [];
+		foreach ($files as $i => $file) {
+			$stripedJobs[$i % $numberOfJobs][] = $file;
+		}
+
+		$jobs = array_values($stripedJobs);


could this last block be simplified to $jobs = array_chunk($files, $this->jobSize); again?

It can't, unfortunately — that's the one shape I measured as actively harmful. With the list sorted by size, array_chunk() puts the ~20 heaviest files together in the first job, and whichever worker picks it up becomes a straggler the whole run waits for: sort + chunk measured +46% cold wall time vs. unsorted chunking on the src/Type self-analysis benchmark, while sort + round-robin striping is what produces the −21..23%. The striping is the point, not an artifact; the new testHeaviestFilesAreSpreadAcrossJobs test pins exactly this property.

Files were chunked into jobs in path order, so a job could end up with all of a directory's heaviest files and become the straggler the whole run waits for. Sorting by file size and dealing files round-robin across jobs gives every job the same size profile: -21..23% cold wall time on a 14-core machine, CPU unchanged. Sorting without striping makes it worse (+46% wall - the heavy files all land in the first job), which is why the striping is essential rather than cosmetic. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

ondrejmirtes · 2026-06-11T09:44:50Z

So what are hyperfine numbers for this one when running complete make phpstan on phpstan-src?

SanderMuller · 2026-06-11T10:18:25Z

Complete make phpstan on phpstan-src (which clears the result cache itself, so every run is a full analysis of the 2,364 files), two clean worktrees with primed file caches, Apple M4 Pro / PHP 8.5.7:

Command	Mean [s]	Min [s]	Max [s]	Relative
base `ff2647a` — `make phpstan`	19.047 ± 0.100	18.921	19.215	1.08 ± 0.01
PR — `make phpstan`	17.573 ± 0.068	17.454	17.635	1.00

−1.47 s, 1.08× faster (hyperfine --warmup 1 --runs 6). User CPU is unchanged within noise (111.3 s vs 113.3 s) — the win is purely better load balance across the workers, not less work. A reversed-order pass (PR first, then base, 3 runs each) reproduces it: 17.77 ± 0.24 s vs 19.04 ± 0.23 s, so it's not an ordering or thermal artifact.

ondrejmirtes · 2026-06-11T10:35:39Z

Thank you!

staabm · 2026-06-12T05:59:47Z

looks like this PR regressed a e2e test:
https://github.com/phpstan/phpstan-src/actions/runs/27340923857/job/80777219668

SanderMuller · 2026-06-12T12:10:02Z

Good catch, you're right — and I now understand why I misdiagnosed it earlier as environmental: on my Mac the e2e fails on the base branch too (APFS resolves the wrong-case PSR-4 path Platforms/SqlitePlatform.php, so class.nameCase fires regardless of this PR), which made me think it wasn't related. On Linux that lookup correctly misses, and what actually decides the outcome there is the analysis order inside the job: MemoizingReflector keys are lowercased, so once src/Persistence/Sql/DbalDriverMiddleware.php resolves SQLitePlatform (the DBAL 4 name), a later lookup of SqlitePlatform from bootstrap-types.php hits that memo and reports class.nameCase instead of the expected class.notFound. The size sort put the 560-byte middleware before the 327-byte bootstrap file; FileFinder order has them the other way around.

Fixed by restoring the original input order within each job after striping — the size sort now only decides which job a file lands in, never the order files are analysed in. That's also the safer contract in general, since results can apparently be order-sensitive. testFilesWithinAJobKeepTheirInputOrder pins exactly the bug-14036 shape, and make phpstan numbers are unchanged (17.73 ± 0.04 s vs 19.00 ± 0.06 s base, 1.07×).

ondrejmirtes requested changes Jun 11, 2026

View reviewed changes

staabm reviewed Jun 11, 2026

View reviewed changes

SanderMuller force-pushed the perf/striped-job-scheduling branch from eede67f to b8e6fde Compare June 11, 2026 09:41

ondrejmirtes approved these changes Jun 11, 2026

View reviewed changes

ondrejmirtes merged commit cb28849 into phpstan:2.2.x Jun 11, 2026
665 of 672 checks passed

SanderMuller mentioned this pull request Jun 11, 2026

Store file cache entries with serialize() instead of var_export/include #5845

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Stripe size-sorted files across parallel jobs#5844

Stripe size-sorted files across parallel jobs#5844
ondrejmirtes merged 1 commit into
phpstan:2.2.xfrom
SanderMuller:perf/striped-job-scheduling

SanderMuller commented Jun 11, 2026 •

edited

Loading

Uh oh!

ondrejmirtes left a comment

Uh oh!

ondrejmirtes Jun 11, 2026

Uh oh!

SanderMuller Jun 11, 2026

Uh oh!

staabm Jun 11, 2026

Uh oh!

SanderMuller Jun 11, 2026

Uh oh!

ondrejmirtes commented Jun 11, 2026

Uh oh!

SanderMuller commented Jun 11, 2026

Uh oh!

Uh oh!

ondrejmirtes commented Jun 11, 2026

Uh oh!

staabm commented Jun 12, 2026

Uh oh!

SanderMuller commented Jun 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

SanderMuller commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What & why

Benchmarks

Tests

Uh oh!

ondrejmirtes left a comment

Choose a reason for hiding this comment

Uh oh!

ondrejmirtes Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

SanderMuller Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

staabm Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

SanderMuller Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

ondrejmirtes commented Jun 11, 2026

Uh oh!

SanderMuller commented Jun 11, 2026

Uh oh!

Uh oh!

ondrejmirtes commented Jun 11, 2026

Uh oh!

staabm commented Jun 12, 2026

Uh oh!

SanderMuller commented Jun 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

SanderMuller commented Jun 11, 2026 •

edited

Loading