APM

>Agent Skill

@anhvth/caching-utilities

skillcode-quality

Guide for using caching utilities in speedy_utils, including memory, disk, and hybrid caching strategies for sync and async functions.

pythondocumentationperformance
apm::install
$apm install @anhvth/caching-utilities
apm::skill.md
---
name: 'caching-utilities'
description: 'Guide for using caching utilities in speedy_utils, including memory, disk, and hybrid caching strategies for sync and async functions.'
---

# Caching Utilities Guide

This skill provides comprehensive guidance for using the caching utilities in `speedy_utils`.

## When to Use This Skill

Use this skill when you need to:
- Optimize performance by caching expensive function calls.
- Persist results across program runs using disk caching.
- Use memory caching for fast access within a single run.
- Handle caching for both synchronous and asynchronous functions.
- Use `imemoize` for persistent caching in interactive environments like Jupyter notebooks.

## Prerequisites

- `speedy_utils` installed in your environment.

## Core Capabilities

### Universal Memoization (`@memoize`)
- Supports `memory`, `disk`, and `both` (hybrid) caching backends.
- Works with both `sync` and `async` functions.
- Configurable LRU cache size for memory caching.
- Custom key generation strategies.

### Interactive Memoization (`@imemoize`)
- Designed for Jupyter notebooks and interactive sessions.
- Persists cache across module reloads (`%load`).
- Uses global memory cache.

### Object Identification (`identify`)
- Generates stable, content-based identifiers for arbitrary Python objects.
- Handles complex types like DataFrames, Pydantic models, and nested structures.

## Usage Examples

### Example 1: Basic Hybrid Caching
Cache results in both memory and disk.

```python
from speedy_utils import memoize
import time

@memoize(cache_type='both', size=128)
def expensive_func(x: int):
    time.sleep(1)
    return x * x
```

### Example 2: Async Disk Caching
Cache results of an async function to disk.

```python
from speedy_utils import memoize
import asyncio

@memoize(cache_type='disk', cache_dir='./my_cache')
async def fetch_data(url: str):
    # simulate network call
    await asyncio.sleep(1)
    return {"data": "content"}
```

### Example 3: Custom Key Function
Use a custom key function for complex arguments.

```python
from speedy_utils import memoize

def get_user_id(user):
    return user.id

@memoize(key=get_user_id)
def process_user(user):
    # ...
    pass
```

### Example 4: Interactive Caching (Notebooks)
Use `@imemoize` to keep cache even if you reload the cell/module.

```python
from speedy_utils import imemoize

@imemoize
def notebook_func(data):
    # ...
    return result
```

## Guidelines

1.  **Choose the Right Backend**:
    - Use `memory` for small, fast results needed frequently in one session.
    - Use `disk` for large results or to persist across runs.
    - Use `both` (default) for the best of both worlds.

2.  **Key Stability**:
    - Ensure arguments are stable (e.g., avoid using objects with changing internal state as keys unless you provide a custom `key` function).
    - `identify` handles most common types, but be careful with custom classes without `__repr__` or stable serialization.

3.  **Cache Directory**:
    - Default disk cache is `~/.cache/speedy_cache`.
    - Override `cache_dir` for project-specific caching.

4.  **Async Support**:
    - The decorators automatically detect `async` functions and handle `await` correctly.
    - Do not mix sync/async usage without proper `await`.

## Common Patterns

### Pattern: Ignoring `self`
By default, `ignore_self=True` is set. This means methods on different instances of the same class will share cache if other arguments are the same. Set `ignore_self=False` if the instance state matters.

```python
class Processor:
    def __init__(self, multiplier):
        self.multiplier = multiplier

    @memoize(ignore_self=False)
    def compute(self, x):
        return x * self.multiplier
```

## Limitations

- **Pickle Compatibility**: Disk caching relies on `pickle` (or JSON). Ensure return values are serializable.
- **Cache Invalidation**: There is no automatic TTL (Time To Live) or expiration. You must manually clear cache files if data becomes stale.