Vine: Add caching support for selective task re-execution #4376
talha129 wants to merge 4 commits into cooperative-computing-lab:master
Please add a test case that exercises the new feature in

It looks like your tests ran, all except the newly added one.

@dthain thanks for pointing it out. I have made it executable. I'm still unable to run the complete test suite locally (7 tests fail), even though the new test file works fine. I'm running it on macOS, so I wanted to know whether there is a previously logged issue about this.
btovar left a comment:
Just some minor comments.
I'd suggest moving each `if self._tasks_cache: ...code...` block into its own function. The first lines inside each of these functions should be `if not self._tasks_cache:\n return`. I think that will make it clear what belongs to the cache and what belongs to the regular TaskVine task management.
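A minimal sketch of the guard-clause layout suggested above. Only the `_tasks_cache` attribute comes from the review; `SketchCache`, `SketchManager`, `_cache_lookup`, `_cache_store`, and `run` are hypothetical names used for illustration and are not taken from the PR.

```python
class SketchCache:
    """Hypothetical in-memory cache; instances are always truthy."""

    def __init__(self):
        self.entries = {}


class SketchManager:
    """Stand-in for the real manager (assumption, illustrative only)."""

    def __init__(self, tasks_cache=None):
        # None means caching is disabled (assumed convention).
        self._tasks_cache = tasks_cache
        self.dispatched = []

    def _cache_lookup(self, key):
        # Guard clause first: nothing below runs when caching is off.
        if not self._tasks_cache:
            return None
        return self._tasks_cache.entries.get(key)

    def _cache_store(self, key, result):
        if not self._tasks_cache:
            return
        self._tasks_cache.entries[key] = result

    def run(self, key, func, *args):
        # Regular task management stays free of inline cache checks.
        cached = self._cache_lookup(key)
        if cached is not None:
            return cached
        result = func(*args)
        self.dispatched.append(key)
        self._cache_store(key, result)
        return result
```

With this layout the main code path calls the cache helpers unconditionally, and each helper decides for itself whether caching is active.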
    self._update_status_display()

    # Drain cached queue before blocking on C runtime.
    if self._task_cache and self._cached_queue:
Same as above: let's isolate this in a method, `wait_for_tag_with_cache` or something like that.
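One way the isolated method might look, as a sketch under assumptions. The method name `wait_for_tag_with_cache` comes from the review comment and `_cached_queue` from the diff; `SketchWaitManager`, `_runtime_queue`, `_wait_runtime`, and `_wait_from_cache` are hypothetical stand-ins, and the real blocking call into the C runtime is replaced by a plain in-memory queue. Tag filtering of cached results is omitted for brevity.

```python
from collections import deque


class SketchWaitManager:
    """Stand-in manager (assumption); not the PR's implementation."""

    def __init__(self, tasks_cache=None):
        self._tasks_cache = tasks_cache
        self._cached_queue = deque()   # tasks already satisfied from cache
        self._runtime_queue = deque()  # stand-in for results from the runtime

    def _wait_runtime(self, tag, timeout):
        # Placeholder for the blocking wait on the C runtime.
        return self._runtime_queue.popleft() if self._runtime_queue else None

    def _wait_from_cache(self):
        # Guard clause: a disabled cache or an empty queue yields nothing.
        if not self._tasks_cache or not self._cached_queue:
            return None
        return self._cached_queue.popleft()

    def wait_for_tag_with_cache(self, tag, timeout=0):
        # Drain cached results first, then fall back to the runtime.
        task = self._wait_from_cache()
        if task is not None:
            return task
        return self._wait_runtime(tag, timeout)
```

Cached results are returned before any blocking wait, so a caller never stalls on the runtime while completed work is already queued locally.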
…cache and TaskCache to _tasks_cache and TasksCache
@btovar I have moved all the caching logic into its own isolated functions and made the other changes as requested.
Proposed Changes
This change is motivated by the caching requirements described in "Efficiently Reproducing Distributed Workflows in Notebook-based Systems" (Azaz et al., arXiv:2603.26965), as suggested by Dr. Douglas Thain.
Task Result Caching (enable_tasks_cache)
Adds opt-in memoization of Task results via Manager.enable_tasks_cache(). On first execution, task outputs are fingerprinted and stored to a local cache directory. On subsequent submissions with identical function and arguments, results are returned directly from cache without dispatching to a worker. Cache state persists across manager restarts via a JSON transaction log. A new test TR_vine_python_cache.sh verifies first-run execution, cache hits on re-submission, and correct cache misses for new arguments.
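The mechanism described above (fingerprint, local cache directory, JSON transaction log) can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `SketchTasksCache`, `run_cached`, the `transactions.jsonl` file name, the use of `pickle` for results, and fingerprinting by qualified function name plus arguments are all hypothetical choices. The actual fingerprinting scheme and log format are not shown in the description.

```python
import hashlib
import json
import os
import pickle


class SketchTasksCache:
    """Hypothetical on-disk memo cache with an append-only JSON log."""

    def __init__(self, cache_dir):
        self.cache_dir = cache_dir
        os.makedirs(cache_dir, exist_ok=True)
        self.log_path = os.path.join(cache_dir, "transactions.jsonl")
        self.index = {}
        self._replay_log()

    def _replay_log(self):
        # Rebuild the in-memory index so state survives a restart.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path) as log:
            for line in log:
                entry = json.loads(line)
                self.index[entry["key"]] = entry["file"]

    @staticmethod
    def fingerprint(func, args, kwargs):
        # Assumption: a function is identified by its qualified name,
        # so edits to the function body do not invalidate old entries.
        payload = pickle.dumps(
            (func.__module__, func.__qualname__, args, sorted(kwargs.items()))
        )
        return hashlib.sha256(payload).hexdigest()

    def get(self, key):
        """Return (hit, value)."""
        path = self.index.get(key)
        if path is None or not os.path.exists(path):
            return False, None
        with open(path, "rb") as f:
            return True, pickle.load(f)

    def put(self, key, value):
        path = os.path.join(self.cache_dir, key + ".pkl")
        with open(path, "wb") as f:
            pickle.dump(value, f)
        # One JSON record per stored result.
        with open(self.log_path, "a") as log:
            log.write(json.dumps({"key": key, "file": path}) + "\n")
        self.index[key] = path


def run_cached(cache, func, *args, **kwargs):
    """Return (result, was_cache_hit) without re-running on a hit."""
    key = cache.fingerprint(func, args, kwargs)
    hit, value = cache.get(key)
    if hit:
        return value, True
    value = func(*args, **kwargs)
    cache.put(key, value)
    return value, False
```

Creating a second `SketchTasksCache` on the same directory replays the log, which mirrors the persistence across manager restarts that the description mentions.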
Merge Checklist
The following items must be completed before PRs can be merged.
Check these off to verify you have completed all steps.
make test: Run local tests prior to pushing.
make format: Format source code to comply with lint policies. Note that some lint errors can only be resolved manually (e.g., Python).
make lint: Run lint on source code prior to pushing.