Fix unchanged-files cache missing changes induced by changed dependencies#8028
Fix unchanged-files cache missing changes induced by changed dependencies#8028SanderMuller wants to merge 1 commit into
Conversation
| return false; | ||
| } | ||
|
|
||
| $this->internalFunctionNames ??= array_flip(get_defined_functions()['internal']); |
There was a problem hiding this comment.
@samsonasik Do you mean this line then? Would static reflection be faster?
There was a problem hiding this comment.
I tried this AI generated get_defined_functions() caching usage on StructArmed, it slow, and use of namespaced_name is more than enough there.
If function check exists needed, there is already phpstan reflection function NativeFunctionReflection for that, so way more reliable per static analysis process instead of runtime php.
There was a problem hiding this comment.
The table is built once per process and memoized, not per call - measured in rector's container: ~0.05ms to build, 10k checks ≈ 1.4ms. Calling get_defined_functions() per node would indeed be slow; this does not.
The namespaced-name comparison answers a different question: "does this resolve to the global namespace", while this guard needs "can this call have a file dependency". Userland functions in the global namespace - app()/config() in Laravel, any composer files-autoloaded helpers - pass that comparison but live in files the cache must track; skipping capture for them would reintroduce the stale-cache bug this PR fixes. The matrix test pins that global userland functions are not treated as native.
NativeFunctionReflection is only known after resolving through the ReflectionProvider (first getFunction() + variants ≈ 24ms, signature map) - the work this guard sits in front of. And the runtime list errs only in the safe direction: a name not in it simply is not skipped, capture proceeds and PHPStan stays authoritative.
There was a problem hiding this comment.
That means the laravelapp() is not properly loaded by phpstan, to make it properly loaded, you may probably needs to load via withBootstrapFiles(), eg:
<?php
use Rector\Config\RectorConfig;
return RectorConfig::configure()
->withBootstrapFiles([__DIR__ . '/bootstrap/app.php'];The reason to use exactly PHPStan reflection as possible is to avoid runtime which inverse with what static analysis purpose.
There was a problem hiding this comment.
The class this thread is on is gone from the PR now — every node goes through PHPStan's DependencyResolver, so function resolution is pure PHPStan reflection.
Your bigger point holds in an important way: the quality of dependency tracking follows the quality of function resolution. I verified the app()-style case with the current branch — a files-autoloaded helper outside the analyzed paths, like vendor helpers: with the project autoloader loaded, PHPStan reflects it with file info, the cache entry records helpers.php as a dependency, and editing only the helper makes the warm run re-report the caller, byte-identical to a fresh run. So composer projects work without withBootstrapFiles(); for functions composer does not load, withBootstrapFiles() is exactly the way to make tracking see them.
When a function is not resolvable at all, PHPStan records no dependency — but it cannot infer anything through it either, so the cached result still matches a fresh run (also verified). Same contract as PHPStan's result cache: edges mirror exactly what PHPStan can see.
Chasing this also surfaced a pre-existing gap: a changed --autoload-file never invalidates the cache (unlike withBootstrapFiles(), which the config hash covers). Fixed separately in #8034.
0960cea to
2347f37
Compare
|
Slimmed this down as discussed: it now contains only the dependency-aware unchanged-files cache fix — 626 lines, half of which is tests. The @samsonasik your |
|
Numbers on a real monorepo, as requested — laravel/framework
Output is byte-identical across all cells, verified per run. Warm ≈ cold on This PR alone is the correctness half. Its cold cost is +4.5% on a precisely interleaved A/B (138-file corpus, 5 rounds), decomposed: the node-type guards and dependency hashing are free, the cost is PHPStan's Proof of concept repo: https://github.com/SanderMuller/rector-stale-cache-poc — reproduces the stale-cache bug in ~30s with vanilla rector 2.4.5 ( One caveat for anyone rerunning this on a rector-src dev checkout: |
|
Pushed f2009eb: codex review caught a gap — PHPStan's |
f2009eb to
1f5c42a
Compare
|
Rewrote this per the feedback: single logic change, one commit, 306 lines. The node-type guards and |
The cache only checked each file's own content, so a clean file stayed skipped on warm runs even when one of its dependencies changed, e.g. a parent class method gaining a return type that lets a child file infer its own. A fresh run reports the new change, a warm run misses it. PHPStanNodeScopeResolver now records each file's dependencies during scope resolution using PHPStan's own DependencyResolver, the same engine behind PHPStan's result cache. Cache entries store the file's own hash plus one hash per dependency, all re-validated on load; legacy string entries self-upgrade on the next write. A failed capture skips caching entirely rather than caching a partial set. Function calls memoize their dependency files per resolved name, as signature dependencies are identical at every call site. Selective runs (--only, --only-suffix) bypass the cache write, same guard as rectorphp#8029. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
1f5c42a to
5403eed
Compare
|
I am closing this in favor of extracted PRs so we can focus on single purpose on single PR: |
Problem
The unchanged-files cache only checks each file's own content. A clean file stays skipped on warm runs even when one of its dependencies changed.
Repro on
main: a parent class method gains: int, which lets a child file infer its own return type. A fresh run reports the new change in the child file, a warm run reports nothing until--clear-cache.Reproducible with vanilla rector in ~30s: https://github.com/SanderMuller/rector-stale-cache-poc
Fix
PHPStanNodeScopeResolverrecords which files each file depends on during scope resolution, using PHPStan's ownDependencyResolver— the same engine behind PHPStan's result cache. No own logic about which nodes matter: every node goes through PHPStan's resolver and its branch chain decides.Cache entries store the file's own hash plus one hash per dependency, all re-validated on load:
Old string entries still load and upgrade themselves on the next write. If capture fails for a file, the file is not cached at all rather than cached with a partial set. Function calls memoize their dependency files per resolved name. Selective runs (
--only,--only-suffix) skip the cache write, same guard as #8029.Cost and scope
~7-8% on a cold run (interleaved A/B, 138-file corpus); warm runs unchanged within noise. Output byte-identical to a fresh run.
Per review feedback this is now the minimal single-logic version: 306 lines, 6 files, one commit, no node-type guards, no
get_defined_functions(), no hardcoded node names — native functions are handled by PHPStan's reflection like everything else. The per-node guards that cut the cold cost to ~4% are split off and can come later as their own perf PR, where that trade-off can be discussed on its own.Follow-ups: #8029 (open), #8033 (diff replay, the 9-170x warm win, stacked on this).