.. _mem_cache:

In-memory weights caching
===========================

*New in version 0.5.0.*

.. note::

    This caching only applies to the :ref:`precomputed ` and :ref:`precomputed-local ` backends in :func:`regrid`.

Purpose
-------

earthkit-regrid provides an **in-memory cache** for pre-computed interpolation weights. When it is enabled, weights loaded from disk are kept in memory, so subsequent calls to :func:`regrid` with the same grids do not have to load them from disk again. The cache can be configured with a maximum size and an eviction policy.

.. note::

    The earthkit-regrid in-memory cache configuration is managed through the :doc:`config`.

.. _mem_cache_policies:

In-memory cache policies
----------------------------

The primary config option to control the in-memory cache is ``weights-memory-cache-policy``, which can take the following values:

- :ref:`largest ` (default)
- :ref:`lru `
- :ref:`unlimited `
- :ref:`off `

.. _largest_mem_cache_policy:

Largest cache policy
++++++++++++++++++++++

When ``weights-memory-cache-policy`` is "largest" (the default), the largest matrices are evicted from the in-memory cache first. The eviction policy is applied before the weights are loaded, to ensure that they will fit into the cache. When this is not possible, the behaviour depends on the :ref:`weights-memory-cache-strict-mode ` option.

The maximum memory size of the in-memory cache is defined by the :ref:`maximum-weights-memory-cache-size ` option. The default is 500 MB.

.. code-block:: python

    >>> from earthkit.regrid import cache, config
    >>> config.set("weights-memory-cache-policy", "largest")
    >>> config.get("weights-memory-cache-policy")
    'largest'
    >>> config.get("maximum-weights-memory-cache-size")
    524288000
    >>> config.get("weights-memory-cache-strict-mode")
    False

.. _lru_mem_cache_policy:

LRU cache policy
++++++++++++++++++++++

When ``weights-memory-cache-policy`` is "lru", the least recently used matrices are evicted from the in-memory cache first. The eviction policy is applied before the weights are loaded, to ensure that they will fit into the cache. When this is not possible, the behaviour depends on the :ref:`weights-memory-cache-strict-mode ` option.

The maximum memory size of the in-memory cache is defined by the :ref:`maximum-weights-memory-cache-size ` option. The default is 500 MB.

.. code-block:: python

    >>> from earthkit.regrid import cache, config
    >>> config.set("weights-memory-cache-policy", "lru")
    >>> config.get("weights-memory-cache-policy")
    'lru'
    >>> config.get("maximum-weights-memory-cache-size")
    524288000
    >>> config.get("weights-memory-cache-strict-mode")
    False

.. _unlimited_mem_cache_policy:

Unlimited cache policy
++++++++++++++++++++++

When ``weights-memory-cache-policy`` is "unlimited", all matrices are kept in memory and nothing is evicted.

.. code-block:: python

    >>> from earthkit.regrid import cache, config
    >>> config.set("weights-memory-cache-policy", "unlimited")
    >>> config.get("weights-memory-cache-policy")
    'unlimited'

.. _off_mem_cache_policy:

Off cache policy
++++++++++++++++++++++

When ``weights-memory-cache-policy`` is "off", there is no in-memory cache and the matrices are always loaded from disk.

.. code-block:: python

    >>> from earthkit.regrid import cache, config
    >>> config.set("weights-memory-cache-policy", "off")
    >>> config.get("weights-memory-cache-policy")
    'off'

.. _mem_cache_state:

Getting the state of the in-memory cache
------------------------------------------

The current state of the in-memory cache can be retrieved using the :func:`memory_cache_info` function. It returns a namedtuple with the fields ``hits``, ``misses``, ``maxsize``, ``currsize``, ``count`` and ``policy``.

.. code:: python

    >>> from earthkit.regrid import memory_cache_info
    >>> memory_cache_info()
    CacheInfo(hits=9, misses=1, maxsize=524288000, currsize=259170724, count=1, policy='largest')

.. _mem_cache_clear:

Clearing the in-memory cache
-----------------------------

The in-memory cache can be cleared using the :func:`clear_memory_cache` function.

.. code:: python

    >>> from earthkit.regrid import clear_memory_cache, memory_cache_info
    >>> clear_memory_cache()
    >>> memory_cache_info()
    CacheInfo(hits=0, misses=0, maxsize=524288000, currsize=0, count=0, policy='largest')

.. _mem_cache_limits:

In-memory cache limits
----------------------------

.. warning::

    These config options are only used when ``weights-memory-cache-policy`` is :ref:`largest ` or :ref:`lru `.

maximum-weights-memory-cache-size
    The ``maximum-weights-memory-cache-size`` option defines the maximum memory size of the in-memory cache in bytes. The default is 500 MB.

weights-memory-cache-strict-mode
    When the ``weights-memory-cache-strict-mode`` option is ``True``, a ``ValueError`` is raised if the weights cannot be fitted into the cache. When it is ``False`` and the weights cannot be fitted into the cache, they are simply not stored in the cache. The default is ``False``.

.. _mem_cache_config:

In-memory cache config parameters
------------------------------------

.. module-output:: generate_config_rst weights-memory-cache-policy maximum-weights-memory-cache-size weights-memory-cache-strict-mode

Other earthkit-regrid config options can be found :ref:`here `.

Notebooks
---------

- :ref:`/examples/memory_cache.ipynb`

Examples
--------

.. code-block:: python

    import numpy as np
    from earthkit.regrid import regrid, config, memory_cache_info

    # use the "largest" policy (largest matrices are evicted first)
    # with a maximum cache size of 100 MB
    config.set(
        weights_memory_cache_policy="largest",
        maximum_weights_memory_cache_size=100 * 1024**2,
    )
    print(memory_cache_info())

    # create a random data array and regrid it
    data = np.random.rand(5248)
    interpolated_data = regrid(data, in_grid={"grid": "O32"}, out_grid={"grid": [5, 5]})
    print(memory_cache_info())

    # repeat the interpolation, this time the weights are taken from the cache
    data = np.random.rand(5248)
    interpolated_data = regrid(data, in_grid={"grid": "O32"}, out_grid={"grid": [5, 5]})
    print(memory_cache_info())

Output::

    CacheInfo(hits=0, misses=0, maxsize=104857600, currsize=0, count=0, policy='largest')
    CacheInfo(hits=0, misses=1, maxsize=104857600, currsize=102340, count=1, policy='largest')
    CacheInfo(hits=1, misses=1, maxsize=104857600, currsize=102340, count=1, policy='largest')
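
To make the difference between the "largest" and "lru" policies more concrete, here is a small toy model of the eviction step. It is only a sketch under stated assumptions and not the earthkit-regrid implementation: the function ``plan_eviction``, its arguments and the rule "evict until the new weights fit" are hypothetical and chosen purely for illustration. It also mimics the two behaviours of ``weights-memory-cache-strict-mode`` when the weights cannot fit at all.

```python
def plan_eviction(entries, new_size, max_size, policy="largest", strict=False):
    """Toy model (hypothetical helper, not part of earthkit-regrid).

    ``entries`` is a list of (key, size_in_bytes) pairs ordered from least
    to most recently used. Returns the keys to evict so that ``new_size``
    bytes fit into a cache of ``max_size`` bytes. When the new item can
    never fit, returns None (non-strict) or raises ValueError (strict).
    """
    if policy not in ("largest", "lru"):
        raise ValueError(f"unsupported policy in this sketch: {policy!r}")

    if new_size > max_size:
        if strict:
            raise ValueError("weights cannot be fitted into the cache")
        return None  # the weights are used but not stored in the cache

    if policy == "largest":
        # biggest entries are evicted first
        candidates = sorted(entries, key=lambda e: e[1], reverse=True)
    else:
        # entries are already ordered from least to most recently used
        candidates = list(entries)

    used = sum(size for _, size in entries)
    evicted = []
    for key, size in candidates:
        if used + new_size <= max_size:
            break  # enough room has been freed
        evicted.append(key)
        used -= size
    return evicted


MB = 1024**2
# "a" is the least recently used entry, "c" the most recently used one
entries = [("a", 50 * MB), ("b", 300 * MB), ("c", 100 * MB)]

print(plan_eviction(entries, 100 * MB, 500 * MB, policy="largest"))  # ['b']
print(plan_eviction(entries, 100 * MB, 500 * MB, policy="lru"))  # ['a']
```

In this toy setup both policies free enough room for the new 100 MB item, but "largest" drops the single 300 MB entry while "lru" drops the oldest 50 MB entry. Which one is preferable depends on whether large matrices or rarely used matrices are cheaper to reload.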