immich/machine-learning/app/models/cache.py

from typing import Any

from aiocache.backends.memory import SimpleMemoryCache
from aiocache.lock import OptimisticLock
from aiocache.plugins import TimingPlugin

from app.models import from_model_type
from app.models.base import InferenceModel

from ..schemas import ModelTask, ModelType, has_profiling


class ModelCache:
    """Fetches a model from an in-memory cache, instantiating it if it's missing."""

    def __init__(
        self,
        revalidate: bool = False,
        timeout: int | None = None,
        profiling: bool = False,
    ) -> None:
        """
        Args:
            revalidate: Resets TTL on cache hit. Useful to keep models in memory while active. Defaults to False.
            timeout: Maximum allowed time for model to load. Disabled if None. Defaults to None.
            profiling: Collects metrics for cache operations, adding slight overhead. Defaults to False.
        """

        plugins = []

        if profiling:
            plugins.append(TimingPlugin())

        self.should_revalidate = revalidate

        self.cache = SimpleMemoryCache(timeout=timeout, plugins=plugins, namespace=None)

    async def get(
        self, model_name: str, model_type: ModelType, model_task: ModelTask, **model_kwargs: Any
    ) -> InferenceModel:
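        # The cache key is the concatenation of model name, type, and task.
        key = f"{model_name}{model_type}{model_task}"

        # cas() raises OptimisticLockError if another writer set the key in the meantime.
        async with OptimisticLock(self.cache, key) as lock:
            model: InferenceModel | None = await self.cache.get(key)
            if model is None:
                # Cache miss: instantiate the model and store it with the optional TTL.
                model = from_model_type(model_name, model_type, model_task, **model_kwargs)
                await lock.cas(model, ttl=model_kwargs.get("ttl", None))
            elif self.should_revalidate:
                # Cache hit: reset the TTL so an active model stays in memory.
                await self.revalidate(key, model_kwargs.get("ttl", None))
        return model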

    async def get_profiling(self) -> dict[str, float] | None:
        if not has_profiling(self.cache):
            return None

        return self.cache.profiling

    async def revalidate(self, key: str, ttl: int | None) -> None:
        # Reset the TTL only for keys that already have an expiry handle scheduled.
        if ttl is not None and key in self.cache._handlers:
            await self.cache.expire(key, ttl)
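
The get-or-load and revalidation flow above can be illustrated with a minimal standard-library sketch. This is not the aiocache-backed implementation: `TinyModelCache`, its `loader` callback, and `demo` are hypothetical names introduced only for illustration, and a plain `asyncio.Lock` stands in for `OptimisticLock`, so concurrent callers are serialized rather than checked optimistically.

```python
import asyncio
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TinyModelCache:
    """Hypothetical stand-in for ModelCache that uses only the standard library."""

    def __init__(self, revalidate: bool = False) -> None:
        self.should_revalidate = revalidate
        # key -> (value, expiry deadline on the monotonic clock, or None for no expiry)
        self._entries: Dict[str, Tuple[Any, Optional[float]]] = {}
        # Assumption: a pessimistic lock replaces aiocache's OptimisticLock here.
        self._lock = asyncio.Lock()

    async def get(self, key: str, loader: Callable[[], Any], ttl: Optional[float] = None) -> Any:
        async with self._lock:
            now = time.monotonic()
            entry = self._entries.get(key)
            # Drop the entry if its deadline has passed.
            if entry is not None and entry[1] is not None and entry[1] <= now:
                del self._entries[key]
                entry = None
            if entry is None:
                # Cache miss: build the value and store it with an optional deadline.
                value = loader()
                self._entries[key] = (value, None if ttl is None else now + ttl)
                return value
            value, _ = entry
            if self.should_revalidate and ttl is not None:
                # Cache hit: push the deadline forward so an active entry stays cached.
                self._entries[key] = (value, now + ttl)
            return value


async def demo() -> None:
    cache = TinyModelCache(revalidate=True)
    first = await cache.get("clip", object, ttl=300)
    second = await cache.get("clip", object, ttl=300)
    print("same instance reused:", first is second)


if __name__ == "__main__":
    asyncio.run(demo())
```

Unlike this sketch, the real class delegates expiry to `SimpleMemoryCache` and takes the TTL from `model_kwargs`.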