perf: replay cached dry-run diffs for unchanged files, 9-150x faster warm dry-runs #8033
SanderMuller wants to merge 3 commits into
return false;
}
$this->internalFunctionNames ??= array_flip(get_defined_functions()['internal']);
Again, please don't do this.
Correctness is more important than speed. Doing this may cause inconsistent results between developers, because it is a runtime check that depends on the extensions available in each local dev environment.
Use PHPStan reflection instead; that's the way to make it reliable.
Applied — NativeFunctionCallAnalyzer is removed from the stack entirely (here and in #8028). Every node now goes through PHPStan's DependencyResolver, so native functions are resolved by PHPStan reflection like everything else. The skip-layer trade-off (~4% vs ~7-8% cold) can be a separate discussion if it ever comes back as its own PR.
$resolvedName = $node->name->getAttribute('resolvedName');
$nameForMemoKey = $resolvedName instanceof Name ? $resolvedName : $node->name;
$functionMemoKey = $mutatingScope->getNamespace() . '|' . strtolower($nameForMemoKey->toCodeString());
Use $this->nodeNameResolver->getName($node) instead.
Applied — the memo key now uses $this->nodeNameResolver->getName($node) (lowercased, as function names are case-insensitive). Measured cost: none (interleaved cold A/B within noise), and verified the namespace-fallback case end to end: two namespaces calling the same unqualified helper() with different resolutions record distinct dependency edges, and editing one helper re-reports only its caller, byte-identical to a fresh run.
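For readers following the memo-key discussion, here is a minimal sketch of per-name memoization with a lowercased key. It is not the PR's code: `FunctionDependencyMemo`, `filesFor()` and the resolver closure are hypothetical stand-ins for the real dependency lookup, and the names passed in are assumed to be already fully resolved.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical memo of dependency files per resolved function name.
 * The closure stands in for the real (expensive) dependency lookup.
 */
final class FunctionDependencyMemo
{
    /** @var array<string, list<string>> */
    private array $filesByFunction = [];

    private int $lookups = 0;

    public function __construct(private \Closure $resolveFiles)
    {
    }

    /** @return list<string> */
    public function filesFor(string $resolvedName): array
    {
        // PHP function names are case-insensitive, so normalize the key.
        $key = strtolower($resolvedName);

        if (! array_key_exists($key, $this->filesByFunction)) {
            ++$this->lookups;
            $this->filesByFunction[$key] = ($this->resolveFiles)($key);
        }

        return $this->filesByFunction[$key];
    }

    public function lookupCount(): int
    {
        return $this->lookups;
    }
}

// Illustrative resolver: maps a resolved name to a made-up file path.
$memo = new FunctionDependencyMemo(
    static fn (string $name): array => ['/src/' . str_replace('\\', '/', $name) . '.php']
);

$memo->filesFor('App\\A\\helper');
$memo->filesFor('App\\A\\Helper'); // same function, different casing: memo hit
$memo->filesFor('App\\B\\helper'); // different resolution: separate entry
```

Because the key is the resolved name rather than the name as written, two namespaces calling an unqualified `helper()` that resolves differently end up with separate entries, while repeated call sites of the same function share one lookup.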
The cache only checked each file's own content, so a clean file stayed skipped on warm runs even when one of its dependencies changed, e.g. a parent class method gaining a return type that lets a child file infer its own. A fresh run reports the new change, a warm run misses it. PHPStanNodeScopeResolver now records each file's dependencies during scope resolution using PHPStan's own DependencyResolver, the same engine behind PHPStan's result cache. Cache entries store the file's own hash plus one hash per dependency, all re-validated on load; legacy string entries self-upgrade on the next write. A failed capture skips caching entirely rather than caching a partial set. Function calls memoize their dependency files per resolved name, as signature dependencies are identical at every call site. Selective runs (--only, --only-suffix) bypass the cache write, same guard as rectorphp#8029. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Files with a pending diff are never marked clean in dry-run mode, the diff must keep being reported, so every warm dry-run reprocessed them from scratch. On a 4,400-file project with 37 pending diffs that was ~11s per run. Cache the FileDiff with the file's own hash plus one hash per captured dependency; when all still match, replay the cached diff instead of reprocessing, skipping scope resolution entirely. Dry-run only: write mode always computes fresh. --no-diffs results never cross into normal entries, and the original hasChanged flag is replayed, as a rule can report line changes while printing identical content. Warm dry-run on the same project: ~9x faster single process, ~3.5x parallel. Output stays byte-identical in every cache state. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
SimpleParameterProvider::hash() serializes the whole parameter bag and contentHash() runs per file, so a warm run paid the serialization once per file (~46ms per 3,200 calls with a 300-entry skip list). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
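The memoization described in this commit can be sketched roughly as follows. This is an illustration under assumptions, not Rector's `SimpleParameterProvider`: the `ParameterBag` class, the `sha1(serialize(...))` hashing and the reset-on-write behaviour are all hypothetical choices made for the example.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical parameter bag; not Rector's actual provider class.
 */
final class ParameterBag
{
    private ?string $memoizedHash = null;

    private int $serializations = 0;

    /** @param array<string, mixed> $parameters */
    public function __construct(private array $parameters)
    {
    }

    public function set(string $name, mixed $value): void
    {
        $this->parameters[$name] = $value;
        // Assumption: any write invalidates the memoized hash.
        $this->memoizedHash = null;
    }

    public function hash(): string
    {
        if ($this->memoizedHash === null) {
            // The expensive step: serialize the whole bag, once per process state.
            ++$this->serializations;
            $this->memoizedHash = sha1(serialize($this->parameters));
        }

        return $this->memoizedHash;
    }

    public function serializationCount(): int
    {
        return $this->serializations;
    }
}

// A 300-entry skip list, hashed once per file on a 3,200-file warm run.
$bag = new ParameterBag(['skip' => array_fill(0, 300, 'some/path')]);
$first = $bag->hash();
for ($i = 0; $i < 3200; ++$i) {
    $bag->hash();
}
$serializationsAfterLoop = $bag->serializationCount();

$bag->set('only', 'SomeRule');
$second = $bag->hash();
```

With the memo in place the bag is serialized once for the whole loop instead of once per call; the second serialization only happens after the bag is modified.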
Built on #8028 (its commit is first; review the last two here). Opening now so the numbers discussion has code attached — happy to rebase once #8028 lands.
Problem
Files with a pending diff are never marked clean in dry-run mode (correctly, since the diff must keep being reported), so every warm dry-run reprocesses them from scratch: parse, full PHPStan scope resolution, every rule. On real projects most warm time is exactly this. laravel/framework src/Illuminate with the prepared sets has 1,526 pending diffs: a warm dry-run costs about the same as a cold one (220s vs 240s single process).
Change
Cache the produced FileDiff keyed on the file's own content hash, the parameter hash and one content hash per captured dependency (the capture from #8028). When everything still matches on the next run, replay the cached diff instead of reprocessing the file, skipping scope resolution entirely.
Dry-run only: write mode always computes fresh. Selective runs (--only, --only-suffix) bypass the cache entirely. --no-diffs results never cross into normal entries. The original hasChanged flag is replayed, since a rule can report line changes while printing identical content. A failed dependency capture means the file is never cached. The parameter hash is memoized per process, as computing it serializes the whole parameter bag and the cache key needs it per file.
Numbers
Output byte-identical to a fresh run in every cache state, verified per measurement; numbers re-measured on the current minimal #8028 base. Cold cost is the #8028 capture (~7-8% interleaved in its minimal form); the replay itself adds nothing measurable on top. The warm gain scales with how many pending diffs the project has; a fully clean project sees no change.
Verification
Invalidation is covered end-to-end in tests: own-content change, dependency change (fresh-process simulation), --no-diffs cross-replay, and the hasChanged flag round-trip. Replay works in parallel mode (workers save, workers replay).
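The own-content and dependency invalidation cases above can be illustrated with a minimal sketch of the replay check. It is written under assumptions and is not the PR's implementation: `CachedDiffEntry`, `canReplay()`, the field names and the use of `sha1` are hypothetical, and an in-memory array stands in for reading files from disk.

```php
<?php

declare(strict_types=1);

/** Hypothetical cache entry; field names are illustrative only. */
final class CachedDiffEntry
{
    /** @param array<string, string> $dependencyHashes dependency path => content hash */
    public function __construct(
        public string $fileHash,
        public string $parameterHash,
        public array $dependencyHashes,
        public string $diff,
        public bool $hasChanged,
    ) {
    }
}

/**
 * Replay is allowed only when the file, the parameters and every
 * captured dependency still hash to the values stored in the entry.
 *
 * @param array<string, string> $contents path => current content (stands in for disk reads)
 */
function canReplay(CachedDiffEntry $entry, string $filePath, array $contents, string $parameterHash): bool
{
    if (! isset($contents[$filePath]) || sha1($contents[$filePath]) !== $entry->fileHash) {
        return false;
    }

    if ($parameterHash !== $entry->parameterHash) {
        return false;
    }

    foreach ($entry->dependencyHashes as $dependencyPath => $hash) {
        // A missing or edited dependency invalidates the cached diff.
        if (! isset($contents[$dependencyPath]) || sha1($contents[$dependencyPath]) !== $hash) {
            return false;
        }
    }

    return true;
}

$contents = [
    'Child.php' => 'class Child extends ParentClass {}',
    'ParentClass.php' => 'class ParentClass { function run() {} }',
];

$entry = new CachedDiffEntry(
    sha1($contents['Child.php']),
    'params-v1',
    ['ParentClass.php' => sha1($contents['ParentClass.php'])],
    'cached diff text',
    true,
);

$unchanged = canReplay($entry, 'Child.php', $contents, 'params-v1');

// The parent gains a return type: the child's cached diff must not be replayed.
$contents['ParentClass.php'] = 'class ParentClass { function run(): void {} }';
$afterDependencyEdit = canReplay($entry, 'Child.php', $contents, 'params-v1');
```

In this sketch the stored `hasChanged` flag travels with the diff, so a replay can report the same result as the run that produced the entry without recomputing it.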