Fix stale JIT dependency caches #637

Open
jhinpan wants to merge 3 commits into ROCm:main from jhinpan:fix-issue-453-jit-helper-cache

@jhinpan jhinpan commented Jun 3, 2026

Restored PR: this is a recreation of #565. The previous PR was closed accidentally when the old head repository was deleted, so GitHub lost the head repo association. This branch is restored at the same old head commit e32d985ce3d5e660615d1a7f585ed6bc1cacb423.


Fixes #453.

Summary

  • Revalidates the JIT definition/dependency cache key in cache-disabled and run-only paths, where iterative development can keep reusing an existing in-process artifact.
  • Clears _call_state_cache, _mem_cache, _last_compiled, and extern linkage tracking when the JIT dependency key changes.
  • Adds a focused compile-only regression test and a small ROCm repro script for the helper-monkeypatch scenario from the issue.
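The invalidation described in the bullets above can be sketched with a toy wrapper. This is a minimal illustration, not the FlyDSL implementation: `RevalidatingJit`, `dependency_key`, `get_deps`, and the `registry` dict are hypothetical names, and hashing `__code__` attributes is only an assumed stand-in for whatever `_jit_function_cache_key` actually hashes. Only the cache attribute names are taken from the PR description.

```python
import hashlib


def dependency_key(*funcs):
    """Hash the bytecode, constants, and names of each function (assumed scheme)."""
    digest = hashlib.sha256()
    for func in funcs:
        code = func.__code__
        digest.update(code.co_code)
        digest.update(repr(code.co_consts).encode())
        digest.update(repr(code.co_names).encode())
    return digest.hexdigest()


class RevalidatingJit:
    """Toy wrapper that re-hashes its dependencies on every call."""

    def __init__(self, func, get_deps):
        self.func = func
        self.get_deps = get_deps  # hypothetical: returns the current helpers
        self.manager_key = None
        self._call_state_cache = {}
        self._mem_cache = {}
        self._last_compiled = None
        self.trace_count = 0

    def _ensure_key(self):
        key = dependency_key(self.func, *self.get_deps())
        if key != self.manager_key:
            # The dependency key changed: drop every in-process artifact.
            self._call_state_cache.clear()
            self._mem_cache.clear()
            self._last_compiled = None
            self.manager_key = key

    def __call__(self, *args):
        self._ensure_key()
        if args not in self._mem_cache:
            self.trace_count += 1
            value = self.func(*args)  # stands in for trace + compile
            self._mem_cache[args] = lambda: value
            self._last_compiled = self._mem_cache[args]
        return self._mem_cache[args]()


def helper(x):
    return x * 2


registry = {"helper": helper}


def kernel(x):
    return registry["helper"](x) + 1


jitted = RevalidatingJit(kernel, lambda: [registry["helper"]])
first = jitted(3)    # traces once
cached = jitted(3)   # same key, reuses the cached artifact


def helper_v2(x):
    return x * 10


registry["helper"] = helper_v2
refreshed = jitted(3)  # key changed, caches cleared, retraced
```

In this sketch the second call reuses the cached artifact, while the call after the helper swap retraces because the dependency key no longer matches.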

Root cause

_ensure_cache_manager() computed manager_key only once for a JitFunction/owner pair. With FLYDSL_RUNTIME_ENABLE_CACHE=0, the disk cache is disabled, but the same JIT wrapper can still reuse _call_state_cache/_mem_cache entries keyed only by call arguments. If a helper function changes inside the same Python process, the wrapper can return the old compiled artifact without retracing.
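The failure mode can be reproduced in miniature without FlyDSL. The sketch below is an assumption-laden toy: `ArgsOnlyJit`, `helpers`, and `kernel` are hypothetical names, and "compiling" is simulated by capturing the traced result. It only illustrates how a cache keyed by call arguments alone misses a helper change made inside the same process.

```python
import types

# Hypothetical helper namespace that gets monkeypatched mid-session.
helpers = types.SimpleNamespace(scale=lambda x: x * 2)


class ArgsOnlyJit:
    """Toy wrapper whose in-process cache is keyed only by call arguments."""

    def __init__(self, func):
        self.func = func
        self._mem_cache = {}
        self.trace_count = 0

    def _compile(self, *args):
        # "Tracing" bakes the helper's current behavior into the artifact.
        self.trace_count += 1
        value = self.func(*args)
        return lambda: value

    def __call__(self, *args):
        if args not in self._mem_cache:
            self._mem_cache[args] = self._compile(*args)
        return self._mem_cache[args]()


def kernel(x):
    return helpers.scale(x) + 1


jitted = ArgsOnlyJit(kernel)
first = jitted(3)

# The helper changes, but the arguments (and so the cache key) do not.
helpers.scale = lambda x: x * 10
second = jitted(3)       # stale: served from the old artifact
expected = kernel(3)     # what a retrace would have produced
```

Here `second` still reflects the original helper, while `expected` shows the value a retrace would return.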

Validation

  • bash scripts/build.sh -j64
  • PYTHONPATH=./build-fly/python_packages:. python -m pytest tests/unit/test_jit_dependency_cache.py tests/unit/test_compile_hints.py::TestCacheDisabledRegression tests/unit/test_jit_cache_key.py tests/unit/test_class_bound_jit_kernel.py -q -s -> 9 passed
  • HIP_VISIBLE_DEVICES=0 FLYDSL_RUNTIME_ENABLE_CACHE=0 PYTHONPATH=./build-fly/python_packages:. timeout 180 python scripts/repro_jit_stale_helper_cache.py -> helper change observed during second call; stale reuse no longer occurs
  • bash scripts/check_python_style.sh --install --include-local

Copilot AI review requested due to automatic review settings June 3, 2026 08:20

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds dependency-aware invalidation for in-process JIT artifacts so helper/source changes trigger a retrace (especially when runtime caching is disabled or in run-only mode), and introduces a regression test + a standalone repro script.

Changes:

  • Update JitFunction._ensure_cache_manager() to re-hash dependencies in cache-disabled / run-only modes and clear in-process caches on manager key changes.
  • Add a unit regression test that mutates a helper function and asserts retracing occurs when cache is disabled.
  • Add a script to reproduce stale helper reuse behavior interactively.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File | Description
python/flydsl/compiler/jit_function.py | Revalidates definition/cache-manager state more aggressively and clears in-process caches when the dependency hash changes.
tests/unit/test_jit_dependency_cache.py | New regression test asserting helper source changes invalidate in-process cache when runtime cache is disabled.
scripts/repro_jit_stale_helper_cache.py | Standalone repro demonstrating stale in-process JIT reuse after helper mutation and the expected fixed behavior.

Comment on lines +1073 to 1091
        """Refresh definition/cache-manager state when the active runtime mode needs it.

        The cache-enabled hot path keeps using the existing ``manager_key`` once
        a cache manager is initialized. Cache-disabled and run-only modes
        re-hash the dependency graph on each call so helper source changes can
        invalidate in-process artifacts during iterative development (#453).
        """
        run_only = env.runtime.run_only
        enable_cache = env.runtime.enable_cache
        need_cache = enable_cache or run_only
        validate_definition_key = (
            self.manager_key is None
            or self._manager_owner_cls is not owner_cls
            or (need_cache and self.cache_manager is None)
            or not enable_cache
            or run_only
        )
        if not validate_definition_key:
            return
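The predicate in the snippet above can be read as a pure function of five booleans. The sketch below restates it that way to make the early-return case explicit; `needs_revalidation` and its parameter names are hypothetical stand-ins for the attribute checks in the diff, not part of the PR.

```python
def needs_revalidation(*, has_key, same_owner, has_cache_manager,
                       enable_cache, run_only):
    """Mirror of validate_definition_key, with attribute checks as booleans."""
    need_cache = enable_cache or run_only
    return (
        not has_key
        or not same_owner
        or (need_cache and not has_cache_manager)
        or not enable_cache
        or run_only
    )


# Cache-enabled hot path: everything is initialized, so no re-hash.
hot_path = needs_revalidation(
    has_key=True, same_owner=True, has_cache_manager=True,
    enable_cache=True, run_only=False,
)

# Cache disabled (the scenario from the issue): always re-hash.
cache_disabled = needs_revalidation(
    has_key=True, same_owner=True, has_cache_manager=False,
    enable_cache=False, run_only=False,
)

# Run-only mode: always re-hash, even with the cache enabled.
run_only_mode = needs_revalidation(
    has_key=True, same_owner=True, has_cache_manager=True,
    enable_cache=True, run_only=True,
)
```

Read this way, the only combination that skips revalidation is the fully initialized, cache-enabled, non-run-only case, which matches the "hot path" wording in the docstring.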
Comment on lines 1093 to +1094
         self._manager_owner_cls = owner_cls
-        self.manager_key = _jit_function_cache_key(self.func, owner_cls=owner_cls)
+        manager_key = _jit_function_cache_key(self.func, owner_cls=owner_cls)
Comment on lines +30 to +32
@flyc.jit
def _dependency_launch(stream: fx.Stream = fx.Stream(None)):
    _dependency_kernel().launch(grid=(1, 1, 1), block=(1, 1, 1), stream=stream)
Development

Successfully merging this pull request may close these issues.

In-process JIT caches can reuse stale compiled artifacts after helper changes