In-memory weights caching
New in version 0.5.0.
Note
This caching applies only to the precomputed and precomputed-local backends of regrid().
Purpose
earthkit-regrid provides an in-memory cache for precomputed interpolation weights. When it is enabled, weights loaded from disk are kept in memory, so later regrid() calls with the same grids do not have to load them from disk again. The cache can be configured with a maximum size and an eviction policy.
Note
The earthkit-regrid in-memory cache is configured through the Configuration.
In-memory cache policies
The primary config option to control the in-memory cache is weights-memory-cache-policy, which can take the following values:
Largest cache policy
When weights-memory-cache-policy is “largest” (the default), the largest matrices are evicted from the in-memory cache first. The eviction policy is applied before the weights are loaded, to ensure that they will fit into the cache. When this is not possible, the behaviour depends on the weights-memory-cache-strict-mode option. The maximum memory size of the in-memory cache is defined by the maximum-weights-memory-cache-size option; the default is 500 MB.
>>> from earthkit.regrid import cache, config
>>> config.set("weights-memory-cache-policy", "largest")
>>> config.get("weights-memory-cache-policy")
'largest'
>>> config.get("maximum-weights-memory-cache-size")
524288000
>>> config.get("weights-memory-cache-strict-mode")
False
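The largest-first rule described above can be illustrated with a small standalone sketch. It is not earthkit-regrid's implementation: the function name evict_largest_first, the size map and the grid labels used as keys are all hypothetical, and only the ordering idea (drop the biggest entries until the new weights fit) is taken from the text.

```python
def evict_largest_first(sizes, incoming_size, max_size):
    """Return the keys to evict so that `incoming_size` bytes fit in `max_size`.

    `sizes` maps a cache key to the size in bytes of its stored matrix.
    Returns None when the incoming item cannot fit even in an empty cache.
    (Hypothetical helper for illustration only.)
    """
    if incoming_size > max_size:
        return None  # what happens next would depend on strict mode

    evicted = []
    used = sum(sizes.values())
    # Walk the entries from largest to smallest until there is enough room.
    for key in sorted(sizes, key=sizes.get, reverse=True):
        if used + incoming_size <= max_size:
            break
        evicted.append(key)
        used -= sizes[key]
    return evicted


# Hypothetical cache contents: key -> size in bytes.
cached = {"O32->5x5": 100, "N320->1x1": 300, "O1280->0.25": 250}
print(evict_largest_first(cached, incoming_size=200, max_size=700))
```

With these made-up sizes, removing the single largest entry frees enough room, so the smaller matrices stay cached.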
LRU cache policy
When weights-memory-cache-policy is “lru”, the least recently used matrices are evicted from the in-memory cache first. The eviction policy is applied before the weights are loaded, to ensure that they will fit into the cache. When this is not possible, the behaviour depends on the weights-memory-cache-strict-mode option. The maximum memory size of the in-memory cache is defined by the maximum-weights-memory-cache-size option; the default is 500 MB.
>>> from earthkit.regrid import cache, config
>>> config.set("weights-memory-cache-policy", "lru")
>>> config.get("weights-memory-cache-policy")
'lru'
>>> config.get("maximum-weights-memory-cache-size")
524288000
>>> config.get("weights-memory-cache-strict-mode")
False
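As a rough illustration of least-recently-used eviction with a size limit, the sketch below uses collections.OrderedDict from the standard library. The class LRUSizeCache and its methods are hypothetical and are not part of earthkit-regrid; they only mirror the behaviour described above (evict before storing, oldest access first).

```python
from collections import OrderedDict


class LRUSizeCache:
    """Toy size-bounded LRU cache; each entry is (payload, size_in_bytes)."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.curr_size = 0
        self._items = OrderedDict()  # least recently used entry comes first

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key][0]

    def put(self, key, payload, size):
        if size > self.max_size:
            return False  # cannot fit at all; left to the caller
        if key in self._items:
            self.curr_size -= self._items.pop(key)[1]
        # Evict before storing, starting with the least recently used entry.
        while self.curr_size + size > self.max_size:
            _, (_, old_size) = self._items.popitem(last=False)
            self.curr_size -= old_size
        self._items[key] = (payload, size)
        self.curr_size += size
        return True

    def keys(self):
        return list(self._items)


lru = LRUSizeCache(max_size=500)
lru.put("a", "weights-a", 200)
lru.put("b", "weights-b", 200)
lru.get("a")                    # "a" becomes the most recently used
lru.put("c", "weights-c", 200)  # room is made by evicting "b"
print(lru.keys())
```

Because "a" was read after "b" was stored, "b" is the entry that gets dropped when "c" needs room.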
Unlimited cache policy
When weights-memory-cache-policy is “unlimited”, all matrices are kept in memory.
>>> from earthkit.regrid import cache, config
>>> config.set("weights-memory-cache-policy", "unlimited")
>>> config.get("weights-memory-cache-policy")
'unlimited'
Off cache policy
When weights-memory-cache-policy is “off”, there is no cache and the matrices are always loaded from disk.
>>> from earthkit.regrid import cache, config
>>> config.set("weights-memory-cache-policy", "off")
>>> config.get("weights-memory-cache-policy")
'off'
Getting the state of the in-memory cache
The current status of the in-memory cache can be retrieved using the memory_cache_info() function. It returns a namedtuple with fields hits, misses, maxsize, currsize, count and policy.
>>> from earthkit.regrid import memory_cache_info
>>> memory_cache_info()
CacheInfo(hits=9, misses=1, maxsize=524288000, currsize=259170724, count=1, policy='largest')
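The fields of the returned tuple can be combined into simple health indicators such as a hit ratio. The sketch below builds a stand-in namedtuple with the documented field names so that it runs without earthkit-regrid installed; the summarise helper is hypothetical and not part of the library.

```python
from collections import namedtuple

# Stand-in with the same field names as the documented return value.
CacheInfo = namedtuple(
    "CacheInfo", ["hits", "misses", "maxsize", "currsize", "count", "policy"]
)


def summarise(info):
    """Return (hit_ratio, fill_fraction) for a CacheInfo-like tuple."""
    lookups = info.hits + info.misses
    hit_ratio = info.hits / lookups if lookups else 0.0
    # Guard against a missing or zero size limit.
    fill = info.currsize / info.maxsize if info.maxsize else 0.0
    return hit_ratio, fill


# Values copied from the example output above.
info = CacheInfo(hits=9, misses=1, maxsize=524288000,
                 currsize=259170724, count=1, policy="largest")
hit_ratio, fill = summarise(info)
print(f"hit ratio: {hit_ratio:.0%}, cache fill: {fill:.0%}")
```

In real use, the tuple returned by memory_cache_info() would be passed to the helper instead of the stand-in.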
Clearing the in-memory cache
The in-memory cache can be cleared using the clear_memory_cache() function.
>>> from earthkit.regrid import clear_memory_cache, memory_cache_info
>>> clear_memory_cache()
>>> memory_cache_info()
CacheInfo(hits=0, misses=0, maxsize=524288000, currsize=0, count=0, policy='largest')
In-memory cache limits
- maximum-weights-memory-cache-size
  The maximum-weights-memory-cache-size option defines the maximum memory size of the in-memory cache in bytes. The default is 500 MB.
- weights-memory-cache-strict-mode
  When the weights-memory-cache-strict-mode option is True, a ValueError is raised if the weights cannot be fitted into the cache. If it is False and the weights cannot be fitted into the cache, they are simply not loaded into the cache. The default is False.
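The difference between strict and non-strict mode can be sketched as follows. This is a simplified illustration, not the library's code: the try_cache function and its arguments are hypothetical, and it assumes that any eviction has already taken place, so only the final "does it fit" decision is shown.

```python
def try_cache(store, key, size, max_size, strict=False):
    """Attempt to add `key` (of `size` bytes) to the size map `store`.

    Returns True if the entry was cached and False if it was skipped.
    Raises ValueError in strict mode when the entry does not fit.
    (Hypothetical helper for illustration only.)
    """
    free = max_size - sum(store.values())
    if size <= free:
        store[key] = size
        return True
    if strict:
        raise ValueError(
            f"weights of {size} bytes do not fit into the cache ({free} bytes free)"
        )
    return False  # non-strict: the weights are simply not cached


store = {}
print(try_cache(store, "small", 100, max_size=500))  # fits, so it is cached
print(try_cache(store, "huge", 1000, max_size=500))  # too big, silently skipped
try:
    try_cache(store, "huge", 1000, max_size=500, strict=True)
except ValueError as exc:
    print("strict mode:", exc)
```

Non-strict mode favours robustness, since the interpolation can still proceed with weights read from disk, while strict mode surfaces an undersized cache as an error.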
In-memory cache config parameters
| Name | Default | Description |
|---|---|---|
| maximum-weights-memory-cache-size | '500MB' | The maximum memory size of the in-memory precomputed weight cache in bytes. Only used when |
| weights-memory-cache-policy | 'largest' | The in-memory precomputed weights cache policy. Valid values: off, unlimited, largest and lru. See In-memory weights caching for more information. |
| weights-memory-cache-strict-mode | False | Raise exception if the weights cannot be fitted into the in-memory cache. Only used when |
Other earthkit-regrid config options can be found here.
Notebooks
Examples
import numpy as np
from earthkit.regrid import regrid, config, memory_cache_info

# set memory cache with a maximum size of 100 MB to evict the largest matrices first
config.set(
    weights_memory_cache_policy="largest",
    maximum_weights_memory_cache_size=100 * 1024**2,
)
print(memory_cache_info())
# create a random data array and regrid it
data = np.random.rand(5248)
interpolated_data = regrid(data, in_grid={"grid": "O32"}, out_grid={"grid": [5, 5]})
print(memory_cache_info())
# repeat interpolation, this time the weights are loaded from the cache
data = np.random.rand(5248)
interpolated_data = regrid(data, in_grid={"grid": "O32"}, out_grid={"grid": [5, 5]})
print(memory_cache_info())
output:
CacheInfo(hits=0, misses=0, maxsize=104857600, currsize=0, count=0, policy='largest')
CacheInfo(hits=0, misses=1, maxsize=104857600, currsize=102340, count=1, policy='largest')
CacheInfo(hits=1, misses=1, maxsize=104857600, currsize=102340, count=1, policy='largest')