The FlowKit cache scoring mechanism (borrowed from https://github.com/dask/cachey) aims to retain cache entries that are frequently and/or recently used, by calculating a cache score with a contribution from each cache touch that decays exponentially over time, with more recent touches weighted more heavily.
This is achieved by adding to a cache entry's score a contribution proportional to $(1+\varepsilon)^t$ each time the cache entry is accessed (where $t$ is the "time" of the cache touch, with "time" defined as an incrementing counter of all cache touches across all cache entries).
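As a rough Python sketch of this scheme (the names and half-life value here are illustrative, not FlowKit's actual implementation, which lives in SQL):

```python
import math

# Illustrative sketch of the cachey-style scoring described above.
# With eps = ln(2)/half_life, a touch at counter value t contributes
# (1 + eps)**t; relative to the latest touch, a touch dt touches earlier
# is worth roughly 2**(-dt/half_life), so its weight halves every
# half_life touches.

HALF_LIFE = 1000.0                    # half-life, in cache touches
EPS = math.log(2) / HALF_LIFE

touch_counter = 0                     # shared across all cache entries
scores = {}                           # cache entry -> accumulated score

def touch(entry):
    """Record one access of `entry`, weighting recent touches more."""
    global touch_counter
    touch_counter += 1
    scores[entry] = scores.get(entry, 0.0) + (1 + EPS) ** touch_counter

touch("a")
touch("b")   # touched as often as "a", but more recently
```

Touched equally often, the more recently touched entry ends up with the higher score, which is the recency weighting described above.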
The problem with this approach is that the value added to `cache_score_multiplier` grows exponentially with the number of cache touches, so in a long-running FlowKit instance the cache scores can get very large. In the worst case, if the number of cache touches exceeds `1024 * cache_half_life()`, attempting to touch a cache record will result in a double-precision overflow.
This can be mitigated by resetting the cache via `flowmachine.core.cache.reset_cache()`, which resets the counter. But ideally we would want to be able to keep the cache active indefinitely without resetting.
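The overflow threshold can be checked numerically (a standalone Python illustration; `half_life` here stands in for the value returned by `cache_half_life()`):

```python
import math

half_life = 1000.0                    # stand-in for cache_half_life()
eps = math.log(2) / half_life

def touch_contribution(t):
    """Value added to cache_score_multiplier at touch counter t."""
    try:
        return (1 + eps) ** t
    except OverflowError:
        return math.inf               # exceeds the largest double

# (1 + eps)**t is approximately 2**(t / half_life), and the largest
# IEEE-754 double is just below 2**1024, so the contribution blows up
# once t passes roughly 1024 * half_life.
print(touch_contribution(int(1023 * half_life)))   # still finite
print(touch_contribution(int(1025 * half_life)))   # inf
```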
An alternative would be to scale down the previous cache score multiplier each time the cache record is touched, instead of increasing the value added for each cache touch. I.e. instead of

```sql
UPDATE cache.cached
    SET cache_score_multiplier = cache_score_multiplier + power( 1 + ln(2)/cache_half_life(),
                                                                 nextval('cache.cache_touches') - 2
        );
```
we could record the last cache touch counter value, and then do

```sql
SELECT nextval('cache.cache_touches') INTO this_cache_touch_number;
UPDATE cache.cached
    SET cache_score_multiplier = cache_score_multiplier * power( 1 - ln(2)/cache_half_life(),
                                                                 this_cache_touch_number - cache_touch_number
        ) + 1,
        cache_touch_number = this_cache_touch_number;
```
(or equivalently,

```sql
cache_score_multiplier = cache_score_multiplier * power( 2,
                                                         (cache_touch_number - this_cache_touch_number)/cache_half_life()
        ) + 1
```

).
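The two formulations can be checked against each other numerically. In this Python sketch (illustrative half-life; the counter arithmetic mirrors the SQL above), the rescaled recurrence stays small and agrees with the current scheme once the latter is normalised by the newest touch's weight:

```python
import math

HALF_LIFE = 1000.0                    # illustrative value
EPS = math.log(2) / HALF_LIFE

def run_rescaled(touch_numbers):
    """Proposed recurrence: decay the stored multiplier by the touches
    elapsed since this entry was last touched, then add 1."""
    multiplier, last = 0.0, touch_numbers[0]
    for t in touch_numbers:
        multiplier = multiplier * 2 ** ((last - t) / HALF_LIFE) + 1
        last = t
    return multiplier

def run_current(touch_numbers):
    """Current scheme, divided by the newest touch's weight so the two
    schemes are directly comparable."""
    total = sum((1 + EPS) ** t for t in touch_numbers)
    return total / (1 + EPS) ** touch_numbers[-1]

touches = [0, 500, 1500, 1600, 3000]
print(run_rescaled(touches))          # ~2.03, and stays bounded
print(run_current(touches))           # ~2.03: equal up to the
                                      # 2**(dt/h) vs (1+ln(2)/h)**dt
                                      # approximation
```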
This would require adding a `cache_touch_number` column to `cache.cached`, and we would also need to scale down the cache score returned here (FlowKit/flowdb/bin/build/0030_utilities.sql, line 270 at f664158):

```sql
RETURN cache_score_multiplier*((compute_time/1000)/tablesize);
```
by a factor of `power(2, (cache_touch_number - currval('cache.cache_touches'))/cache_half_life())`. Alternatively we could replace `(cache_touch_number - this_cache_touch_number)` with `EXTRACT(EPOCH FROM last_accessed - NOW())` (note: `EPOCH` gives the total elapsed seconds; extracting the `SECONDS` field would give only the seconds component of the interval), and do away with the `cache_touches` counter altogether, so that the contribution from each cache touch decays exponentially with real time rather than with the total number of cache touches. The half-life would then be measured in seconds rather than in cache touches.
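A minimal Python sketch of this real-time variant (names and the one-hour half-life are illustrative):

```python
from datetime import datetime, timedelta

HALF_LIFE_SECONDS = 3600.0            # illustrative: one-hour half-life

def touch(multiplier, last_accessed, now):
    """Decay the stored multiplier by the wall-clock time elapsed since
    the last access, then add this touch's contribution of 1."""
    elapsed = (now - last_accessed).total_seconds()
    return multiplier * 2 ** (-elapsed / HALF_LIFE_SECONDS) + 1

t0 = datetime(2023, 1, 1, 12, 0, 0)
# Exactly one half-life later: the stored contribution halves.
print(touch(2.0, t0, t0 + timedelta(hours=1)))   # 2.0 * 0.5 + 1 = 2.0
```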
This approach is equivalent to the current one, but by normalising the cache score multipliers at each touch we prevent them from growing excessively large. There is then the possibility of underflow, rather than overflow, if a cache entry was last touched a very long time ago - but if this occurs we can simply ignore the contribution from previous cache touches, because it will be negligible compared to that from the new touch (i.e. set `cache_score_multiplier = 1`).
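In IEEE-754 doubles the underflow case is in fact fairly benign: the decay factor underflows gradually to zero rather than raising an error, which yields exactly the `cache_score_multiplier = 1` fallback suggested above. A Python sketch (illustrative half-life; whether `power()` on the database side behaves the same way would need checking):

```python
HALF_LIFE = 1000.0                    # illustrative, in cache touches

def touch(multiplier, touches_since_last):
    # For a very stale entry, 2**(-dt/HALF_LIFE) underflows gradually
    # to 0.0 (no exception for Python floats), so the update degrades
    # to multiplier = 1: the stale contribution is dropped automatically.
    return multiplier * 2 ** (-touches_since_last / HALF_LIFE) + 1

print(touch(1e300, 2_000_000))        # very stale entry: exactly 1.0
```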