From fb531af79ad4c2f356109d2a12b2fe3e39e8e30c Mon Sep 17 00:00:00 2001
From: Virgil
Date: Thu, 19 Feb 2026 21:36:13 +0000
Subject: [PATCH] docs: add TODO.md and FINDINGS.md for fleet delegation

Co-Authored-By: Virgil
---
 FINDINGS.md | 23 +++++++++++++++++++++++
 TODO.md     | 27 +++++++++++++++++++++++++++
 2 files changed, 50 insertions(+)
 create mode 100644 FINDINGS.md
 create mode 100644 TODO.md

diff --git a/FINDINGS.md b/FINDINGS.md
new file mode 100644
index 0000000..9522692
--- /dev/null
+++ b/FINDINGS.md
@@ -0,0 +1,23 @@
+# FINDINGS.md -- go-ratelimit
+
+## 2026-02-19: Split from core/go (Virgil)
+
+### Origin
+
+Extracted from `forge.lthn.ai/core/go` on 19 Feb 2026.
+
+### Architecture
+
+- Sliding window rate limiter (1-minute window)
+- Daily request caps per model
+- Token counting via Google `CountTokens` API
+- Model-specific quota configuration
+
+### Gemini-Specific Defaults
+
+- `gemini-3-pro-preview`: 150 RPM / 1M TPM / 1000 RPD
+- Quotas are currently hardcoded -- needs generalisation (see TODO Phase 1)
+
+### Tests
+
+- 1 test file covering sliding window and quota enforcement
diff --git a/TODO.md b/TODO.md
new file mode 100644
index 0000000..71e8fb2
--- /dev/null
+++ b/TODO.md
@@ -0,0 +1,27 @@
+# TODO.md -- go-ratelimit
+
+## Phase 1: Generalise Beyond Gemini
+
+- [ ] Hardcoded model quotas are Gemini-specific -- abstract to provider-agnostic config
+- [ ] Add quota profiles for OpenAI, Anthropic, and local (Ollama/MLX) backends
+- [ ] Make default quotas configurable via YAML or environment variables
+
+## Phase 2: Persistent State
+
+- [ ] Currently stores state in YAML file -- not safe for multi-process access
+- [ ] Consider SQLite for concurrent read/write safety (WAL mode)
+- [ ] Add state recovery on restart (reload sliding window from persisted data)
+
+## Phase 3: Integration
+
+- [ ] Wire into go-ml backends for automatic rate limiting on inference calls
+- [ ] Wire into go-ai facade so all providers share a unified rate limit layer
+- [ ] Add metrics export (requests/minute, tokens/minute, rejections) for monitoring
+
+---
+
+## Workflow
+
+1. Virgil in core/go writes tasks here after research
+2. This repo's dedicated session picks up tasks in phase order
+3. Mark `[x]` when done, note commit hash