Unify LRU memory-limiting caches into one generic cache #22613
Thank you for opening this pull request! Reviewer note: cargo-semver-checks reported the current version number is not SemVer-compatible with the changes in this pull request (compared against the base branch). Details |
🤖 Benchmark running (GKE): comparing generic-cache-1 (5920922) to 32a1fe5 (merge-base) using clickbench_partitioned
🤖 Benchmark running (GKE): comparing generic-cache-1 (5920922) to 32a1fe5 (merge-base) using tpcds
🤖 Benchmark running (GKE): comparing generic-cache-1 (5920922) to 32a1fe5 (merge-base) using tpch
🤖 Benchmark completed (GKE): tpch, base (merge-base) vs. branch
🤖 Benchmark completed (GKE): tpcds, base (merge-base) vs. branch
🤖 Benchmark completed (GKE): clickbench_partitioned, base (merge-base) vs. branch
run benchmark clickbench_partitioned
🤖 Benchmark running (GKE): comparing generic-cache-1 (5920922) to 32a1fe5 (merge-base) using clickbench_partitioned
🤖 Benchmark completed (GKE): clickbench_partitioned, base (merge-base) vs. branch
@nuno-faria Do you have time to review this? Thanks
nuno-faria
left a comment
Thanks @mkleen. Overall, I think this makes the cache manager easier to understand and to extend in the future. The benchmarks also appear unchanged, which is good. I think it's best to move the tests in file_metadata_cache.rs, ..., in a follow-up PR.
In addition to the comments below, this needs an entry in the upgrade guide that explains what needs to be migrated.
I would also like someone like @alamb or @adriangb to take a look if they can.
@@ -380,23 +333,21 @@ impl CacheManager {
             }
             Some(Arc::clone(lfc))
         }
-        None if config.list_files_cache_limit > 0 => {
-            let lfc: Arc<dyn ListFilesCache> = Arc::new(DefaultListFilesCache::new(
+        None if config.list_files_cache_limit > 0 => Some(Arc::new(
+            DefaultCache::<TableScopedPath, CachedFileList>::with_ttl(
                 config.list_files_cache_limit,
                 config.list_files_cache_ttl,
-            ));
-            Some(lfc)
-        }
+            )
+            .with_name("DefaultListFilesCache"),
+        )),
         _ => None,
     };
Can the creation of these two caches follow the pattern used by file_metadata_cache? I think it would make it easier to understand.
The semantics of file_metadata_cache differ from those of list_file_cache and file_statistics_cache. The latter two are optional and must be disabled, resulting in None, when the cache limit is 0. Therefore they cannot follow the same pattern.
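To make the difference concrete, here is a minimal sketch of the two construction patterns. All names (`StubConfig`, `StubCache`, `optional_list_files_cache`, `always_on_metadata_cache`) are hypothetical stand-ins rather than the PR's types, and the sketch assumes the metadata cache is always constructed, which is only implied by the contrast drawn above.

```rust
use std::sync::Arc;
use std::time::Duration;

/// Hypothetical stand-in for the relevant cache settings.
pub struct StubConfig {
    pub list_files_cache_limit: usize,
    pub list_files_cache_ttl: Option<Duration>,
    pub metadata_cache_limit: usize,
}

/// Hypothetical stand-in for a cache; it only records its settings.
#[derive(Debug, PartialEq)]
pub struct StubCache {
    pub limit: usize,
    pub ttl: Option<Duration>,
}

/// Optional cache: a user-supplied cache wins, and a limit of 0
/// disables the default cache entirely (the result is `None`).
pub fn optional_list_files_cache(
    user: Option<Arc<StubCache>>,
    config: &StubConfig,
) -> Option<Arc<StubCache>> {
    match user {
        Some(cache) => Some(cache),
        None if config.list_files_cache_limit > 0 => Some(Arc::new(StubCache {
            limit: config.list_files_cache_limit,
            ttl: config.list_files_cache_ttl,
        })),
        _ => None,
    }
}

/// Always-present cache (assumed semantics): fall back to a default
/// instance, so there is no `None` case to model.
pub fn always_on_metadata_cache(
    user: Option<Arc<StubCache>>,
    config: &StubConfig,
) -> Arc<StubCache> {
    user.unwrap_or_else(|| {
        Arc::new(StubCache {
            limit: config.metadata_cache_limit,
            ttl: None,
        })
    })
}
```

The return types carry the distinction: `Option<Arc<_>>` for the caches that can be switched off, and a plain `Arc<_>` for the one that cannot.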
@nuno-faria Thanks a lot for taking the time to review this!
@@ -96,3 +96,19 @@ as a supertrait:
- pub trait QueryPlanner: Debug
+ pub trait QueryPlanner: Any + Debug
This is not part of this edit.
Which issue does this PR close?
Rationale for this change
This PR introduces a new cache which merges the functionality of file-metadata cache, list-files cache and file-statistics cache into one generic implementation. This removes a lot of redundant code.
What changes are included in this PR?
- Introduce a generic DefaultCache with LRU eviction, a memory limit and a TTL.
- Migrate all cache tests to use the new DefaultCache.
- Replace usages of the file-metadata cache, list-files cache and file-statistics cache implementations with the new generic version.
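For readers unfamiliar with the combination of LRU eviction, a memory limit and a TTL, the following is a minimal single-threaded sketch of the idea. It is not the PR's DefaultCache: `SketchCache`, the `MemSize` trait and every method name are assumptions made for illustration, and eviction uses a simple O(n) scan instead of a real LRU list.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::time::{Duration, Instant};

/// Hypothetical trait: bytes charged to the budget for a key or value.
pub trait MemSize {
    fn mem_size(&self) -> usize;
}
impl MemSize for String {
    fn mem_size(&self) -> usize { self.len() }
}
impl MemSize for Vec<u8> {
    fn mem_size(&self) -> usize { self.len() }
}

struct Entry<V> {
    value: V,
    size: usize,       // key bytes + value bytes
    last_used: u64,    // logical clock tick of the last access
    inserted: Instant, // used for TTL expiry
}

pub struct SketchCache<K, V> {
    entries: HashMap<K, Entry<V>>,
    limit: usize,
    used: usize,
    ttl: Option<Duration>,
    tick: u64,
}

impl<K: Hash + Eq + Clone + MemSize, V: Clone + MemSize> SketchCache<K, V> {
    pub fn new(limit: usize, ttl: Option<Duration>) -> Self {
        Self { entries: HashMap::new(), limit, used: 0, ttl, tick: 0 }
    }

    pub fn memory_used(&self) -> usize { self.used }

    /// Returns a clone of the value and marks the entry as recently used.
    /// Entries at or past their TTL are removed and reported as misses.
    pub fn get(&mut self, key: &K) -> Option<V> {
        let expired = match (self.entries.get(key), self.ttl) {
            (Some(e), Some(ttl)) => e.inserted.elapsed() >= ttl,
            _ => false,
        };
        if expired {
            self.remove(key);
            return None;
        }
        self.tick += 1;
        let tick = self.tick;
        self.entries.get_mut(key).map(|e| {
            e.last_used = tick;
            e.value.clone()
        })
    }

    /// Inserts an entry, evicting least recently used entries until it fits.
    pub fn put(&mut self, key: K, value: V) {
        self.remove(&key);
        let size = key.mem_size() + value.mem_size();
        if size > self.limit {
            return; // larger than the whole budget: do not cache
        }
        while self.used + size > self.limit {
            let oldest = self
                .entries
                .iter()
                .min_by_key(|(_, e)| e.last_used)
                .map(|(k, _)| k.clone());
            match oldest {
                Some(k) => self.remove(&k),
                None => break,
            }
        }
        self.tick += 1;
        self.used += size;
        let entry = Entry { value, size, last_used: self.tick, inserted: Instant::now() };
        self.entries.insert(key, entry);
    }

    fn remove(&mut self, key: &K) {
        if let Some(e) = self.entries.remove(key) {
            self.used -= e.size;
        }
    }
}
```

Making the key and value types generic is what lets one implementation serve several caches; a production version would additionally need interior locking and a cheaper eviction structure.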
Are these changes tested?
Yes. All previous cache tests are migrated to the new cache and pass. They had to be slightly adapted because the new implementation also counts the cache key in its memory accounting, which was not the case for all previous implementations. The tests are still in the same location so that reviewers get a readable diff.
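A small worked example shows why counting the key can change test expectations. The helper `entries_that_fit` and the byte sizes are hypothetical and chosen only for the arithmetic; they are not taken from the PR's tests.

```rust
/// Number of equally sized entries that fit under `limit` bytes.
/// `count_key` toggles whether key bytes are charged to the budget.
/// Returns `None` when an entry is charged zero bytes.
pub fn entries_that_fit(
    limit: usize,
    key_bytes: usize,
    value_bytes: usize,
    count_key: bool,
) -> Option<usize> {
    let per_entry = if count_key {
        key_bytes + value_bytes
    } else {
        value_bytes
    };
    limit.checked_div(per_entry)
}
```

With a 100-byte limit, 20-byte keys and 30-byte values, three entries fit when only values are counted (100 / 30), but only two fit once keys are counted as well (100 / 50). A test that previously expected the third insert to succeed without an eviction would therefore need a larger limit or a different expectation.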
Once the tests are reviewed and approved, they should probably move to default_cache.rs, and the files file_statistics_cache.rs, file_metadata_cache.rs and list_files_cache.rs can be removed.
Are there any user-facing changes?
The traits FileStatisticsCache, ListFilesCache and FileMetadataCache are replaced with the types Cache<TableScopedPath, CachedFileMetadata>, Cache<TableScopedPath, CachedFileList> and Cache<Path, CachedFileMetadataEntry>.
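For downstream code, the migration could look roughly like the sketch below. Everything here is a stand-in: the `Cache` trait is modeled with assumed `get`/`put` methods that may not match the PR, `TableScopedPath` and `CachedFileList` are simplified to wrappers around strings, and `ListFilesCacheRef`, `MapCache` and `cached_listing` are hypothetical names.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::sync::{Arc, Mutex};

/// Hypothetical model of a generic cache abstraction; the real
/// `Cache<K, V>` may expose different methods.
pub trait Cache<K, V>: Send + Sync {
    fn get(&self, key: &K) -> Option<V>;
    fn put(&self, key: K, value: V) -> Option<V>;
}

/// Simplified stand-ins for the key and value types named above.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct TableScopedPath(pub String);
#[derive(Clone, Debug, PartialEq)]
pub struct CachedFileList(pub Vec<String>);

/// Where code used to hold a dedicated trait object for the list-files
/// cache, it would now name the generic type with concrete parameters.
pub type ListFilesCacheRef = Arc<dyn Cache<TableScopedPath, CachedFileList>>;

/// Toy unbounded implementation, only to exercise the alias.
pub struct MapCache<K, V>(Mutex<HashMap<K, V>>);

impl<K, V> MapCache<K, V> {
    pub fn new() -> Self {
        MapCache(Mutex::new(HashMap::new()))
    }
}

impl<K, V> Cache<K, V> for MapCache<K, V>
where
    K: Hash + Eq + Send,
    V: Clone + Send,
{
    fn get(&self, key: &K) -> Option<V> {
        self.0.lock().unwrap().get(key).cloned()
    }
    fn put(&self, key: K, value: V) -> Option<V> {
        self.0.lock().unwrap().insert(key, value)
    }
}

/// Example consumer that depends only on the generic abstraction.
pub fn cached_listing(cache: &ListFilesCacheRef, table: &str) -> Option<CachedFileList> {
    cache.get(&TableScopedPath(table.to_string()))
}
```

Under these assumptions, custom cache implementations would implement the generic abstraction once per key/value pair rather than one dedicated trait per cache kind.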