fix: implement thread-safe in-memory cache in data_loader.py #642
Merged
komalharshita merged 3 commits on Jun 5, 2026
_projects_cache was declared but load_all_projects() never read or wrote it, causing a redundant disk read on every request. Added double-checked locking with threading.Lock so the JSON file is read once and reused safely across concurrent requests. clear_cache() now acquires the same lock before resetting.
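For reviewers, the pattern described above can be sketched roughly as follows. This is a minimal illustration, not the merged code: the names `_projects_cache`, `_cache_lock`, `load_all_projects()`, and `clear_cache()` come from this PR, while `PROJECTS_PATH` and the `path` parameter are assumptions added so the sketch runs on its own.

```python
import json
import threading

# Hypothetical location; the real path to projects.json is not shown in this PR.
PROJECTS_PATH = "data/projects.json"

_projects_cache = None
_cache_lock = threading.Lock()


def load_all_projects(path=PROJECTS_PATH):
    """Return the parsed project data, reading the file only on a cache miss."""
    global _projects_cache
    # First check, without the lock: the fast path once the cache is warm.
    cached = _projects_cache
    if cached is not None:
        return cached
    with _cache_lock:
        # Second check, under the lock: another thread may have filled the
        # cache while this one was waiting to acquire the lock.
        if _projects_cache is None:
            with open(path, encoding="utf-8") as f:
                _projects_cache = json.load(f)
        return _projects_cache


def clear_cache():
    """Reset the cache under the same lock so the next call re-reads the file."""
    global _projects_cache
    with _cache_lock:
        _projects_cache = None
```

The unlocked first check reads the module-level variable into a local before testing it, so a concurrent `clear_cache()` cannot turn a hit into a returned `None`.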
Contributor
Author
@komalharshita this PR is ready for review and all CI checks pass. Could you please add the relevant labels? It helps with tracking. Thank you!
Contributor
Author
Hi @komalharshita, just a gentle check-in on this PR. It has been a couple of days since the last activity. Happy to make any changes if you have feedback. Thanks for your time!
Contributor
Author
Gentle ping -- this PR has been open for 2 days with no activity. Could you please review it when you get a chance? Happy to make any adjustments.
Merges upstream single-check cache with thread-safe double-checked locking. Keeps the _cache_lock guard to prevent duplicate file reads under concurrent requests.
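One way to check the "no duplicate reads under concurrent requests" claim is sketched below. It is a standalone illustration under stated assumptions: `slow_read()`, `load_locked()`, `run_concurrently()`, and `read_count` are hypothetical names, and the sleep stands in for parsing the JSON file so that all threads arrive while the first read is still in progress.

```python
import threading
import time

_cache = None
_cache_lock = threading.Lock()
read_count = 0  # number of simulated disk reads


def slow_read():
    """Stand-in for reading projects.json; the sleep widens the race window."""
    global read_count
    read_count += 1
    time.sleep(0.05)
    return [{"id": 1}]


def load_locked():
    """Double-checked locking: only one thread performs the read."""
    global _cache
    if _cache is not None:
        return _cache
    with _cache_lock:
        if _cache is None:
            _cache = slow_read()
        return _cache


def run_concurrently(fn, n=8):
    """Release n threads at the same moment and collect what each one gets."""
    barrier = threading.Barrier(n)
    results = []

    def worker():
        barrier.wait()  # line all threads up before the first call
        results.append(fn())

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


results = run_concurrently(load_locked)
print(read_count)  # 1: the read happens only under the lock, after the second check
```

Because `slow_read()` is only ever called while the lock is held and the cache is still empty, the count stays at one regardless of how the threads are scheduled. A single-check version without the lock has no such guarantee.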
Owner
Looks safe to merge now
Description
utils/data_loader.py declared _projects_cache = None but load_all_projects() never read or wrote that variable. Every call to load_all_projects() opened and parsed projects.json from disk unconditionally. Routes that touch /, /api/recommend, and /project/<id> each trigger at least one redundant file read per request.
Root Cause
Related Issue
Closes #271
Type of Change
Changes Made
utils/data_loader.py:
Added _cache_lock = threading.Lock() for thread safety.
Added double-checked locking in load_all_projects() so the file is read at most once per process lifetime, unless clear_cache() is called.
clear_cache() acquires the lock before resetting _projects_cache = None to prevent partial reads during cache invalidation.
Testing Done
load_all_projects(): file is read and result is cached.
clear_cache(): resets the cache; next call re-reads the file.
Checklist