Provider-agnostic sliding window rate limiter for LLM API calls. Enforces requests per minute (RPM), tokens per minute (TPM), and requests per day (RPD) quotas per model using an in-memory sliding window. Ships with default quota profiles for Gemini, OpenAI, Anthropic, and a local inference provider. State persists across process restarts via YAML (single-process) or SQLite (multi-process, WAL mode). Includes a Gemini-specific token counting helper and a YAML-to-SQLite migration path.