go-ratelimit/docs/history.md
Virgil ed5949ec3a
feat(ratelimit): add agent decision guidance
Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-30 08:16:44 +00:00

<!-- SPDX-License-Identifier: EUPL-1.2 -->
# Project History
## Origin
go-ratelimit was extracted from the `pkg/ratelimit` package inside
`forge.lthn.ai/core/go` on 19 February 2026. The package now lives at
`dappco.re/go/core/go-ratelimit`, with its own repository and independent
development cadence.

Initial commit: `fa1a6fc` (`feat: extract go-ratelimit from core/go pkg/ratelimit`)

At extraction the package implemented:
- Sliding window rate limiter with 1-minute window
- Daily request caps per model
- Token counting via Google `CountTokens` API
- Hardcoded Gemini quota defaults (`gemini-3-pro-preview`: 150 RPM / 1M TPM / 1000 RPD)
- YAML persistence to `~/.core/ratelimits.yaml`
- Single test file with basic sliding window and quota enforcement tests
---
## Phase 0 — Hardening and Test Coverage
Commit: `3c63b10` (`feat(ratelimit): generalise beyond Gemini with provider profiles and push coverage to 95%`)

Supplementary commit: `db958f2` (`test: expand race coverage and benchmarks`)

Coverage increased from 77.1% to above the 95% floor. The test suite was rewritten using
testify with table-driven subtests throughout.
### Tests added
- `TestCanSend` — boundary conditions at exact RPM, TPM, and RPD limits;
RPM-only and TPM-only quotas; zero-token estimates; unknown and unlimited models
- `TestPrune` — pruning of old entries, retention of recent entries, daily reset
at 24-hour boundary, no-op on non-existent model, boundary-exact timestamps
- `TestRecordUsage` — fresh state, accumulation, insertion into existing state
- `TestReset` — single model, all models (empty string argument), non-existent model
- `TestWaitForCapacity` — context cancellation, pre-cancelled context,
immediate capacity, unknown model
- `TestStats` / `TestAllStats` — known, unknown, and quota-only models; pruning
and daily reset inside `AllStats()`
- `TestPersistAndLoad` — round-trip, missing file, corrupt YAML, unreadable file,
nested directory creation, unwritable directory
- `TestConcurrentAccess` — 20 goroutines x 50 ops each (CanSend + RecordUsage + Stats)
- `TestConcurrentResetAndRecord` — concurrent Reset + RecordUsage + AllStats
- `TestConcurrentMultipleModels` — 5 models, concurrent access
- `TestConcurrentPersistAndLoad` — filesystem race between Persist and Load
- `TestConcurrentWaitForCapacityAndRecordUsage` — WaitForCapacity racing RecordUsage
### Benchmarks added
- `BenchmarkCanSend` — 1,000-entry sliding window
- `BenchmarkRecordUsage`
- `BenchmarkCanSendConcurrent` — parallel goroutines
- `BenchmarkCanSendWithPrune` — 500 old + 500 new entries
- `BenchmarkStats` — 1,000-entry window
- `BenchmarkAllStats` — 5 models x 200 entries
- `BenchmarkPersist` — YAML I/O
`go test -race ./...` passed clean. `go vet ./...` produced no warnings.

---
## Phase 1 — Generalisation Beyond Gemini
Commit: `3c63b10` (the same commit as Phase 0)

The hardcoded Gemini quotas in `New()` were replaced with a provider-agnostic
configuration system without breaking the existing API.
### New types and functions
- `Provider` string type with constants: `ProviderGemini`, `ProviderOpenAI`,
`ProviderAnthropic`, `ProviderLocal`
- `ProviderProfile` — bundles a provider identifier with its model quota map
- `Config` — construction configuration accepting `FilePath`, `Backend`,
`Providers`, and `Quotas` fields
- `DefaultProfiles()` — returns fresh pre-configured profiles for all four providers
- `NewWithConfig(Config)` — creates a limiter from explicit configuration
- `SetQuota(model, quota)` — runtime quota modification, mutex-protected
- `AddProvider(provider)` — loads all default quotas for a provider at runtime,
additive, mutex-protected
### Backward compatibility
`New()` delegates to `NewWithConfig(Config{Providers: []Provider{ProviderGemini}})`.
`TestNewBackwardCompatibility` asserts exact parity with the original hardcoded
values. No existing call sites required modification.
### Design decision: merge-on-top
Explicit `Config.Quotas` override provider profile defaults. This allows callers
to use a provider profile for most models while customising specific model limits
without forking the entire profile.
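
A minimal sketch of merge-on-top, assuming quotas are held in a map keyed by model name. The `ModelQuota` field names and `mergeQuotas` are hypothetical, and the sketch assumes an override replaces the whole entry for a model rather than individual fields; the package may differ on both points.

```go
package main

// ModelQuota is a hypothetical quota shape for this sketch; the fields
// mirror the RPM / TPM / RPD limits mentioned in this document.
type ModelQuota struct {
	MaxRPM int
	MaxTPM int
	MaxRPD int
}

// mergeQuotas copies the provider profile defaults into a fresh map and
// then lays the explicit overrides on top. Neither input map is modified.
func mergeQuotas(defaults, overrides map[string]ModelQuota) map[string]ModelQuota {
	merged := make(map[string]ModelQuota, len(defaults)+len(overrides))
	for model, quota := range defaults {
		merged[model] = quota
	}
	// Overrides are applied last, so they win for any model present in both.
	for model, quota := range overrides {
		merged[model] = quota
	}
	return merged
}
```
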

---
## Phase 2 — SQLite Persistent State
Commit: `1afb1d6` (`feat(persist): Phase 2 — SQLite backend with WAL mode`)

The YAML backend serialises the full state on every `Persist()` call and is
not safe for concurrent multi-process access. Phase 2 added a SQLite backend
using `modernc.org/sqlite` (pure Go, no CGO) following the go-store pattern
established elsewhere in the ecosystem.
### New constructors
- `NewWithSQLite(dbPath string)` — SQLite-backed limiter with Gemini defaults
- `NewWithSQLiteConfig(dbPath string, cfg Config)` — SQLite-backed with custom config
- `Close() error` — releases the database connection; no-op on YAML-backed limiters
### Migration
- `MigrateYAMLToSQLite(yamlPath, sqlitePath string) error` — one-shot migration
helper that reads an existing YAML state file and writes all quotas and usage
state to a new SQLite database
### SQLite connection settings
- `PRAGMA journal_mode=WAL` — enables concurrent reads alongside a single writer
- `PRAGMA busy_timeout=5000` — 5-second wait on lock contention before returning an error
- `db.SetMaxOpenConns(1)` — single connection for PRAGMA consistency
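
The three settings above might be applied along these lines. This is a sketch rather than the package's code: `configureSQLite` and `sqlitePragmas` are hypothetical names, and the handle is assumed to come from `sql.Open` with the `modernc.org/sqlite` driver already registered.

```go
package main

import (
	"database/sql"
	"fmt"
)

// sqlitePragmas lists the connection settings described in this section.
var sqlitePragmas = []string{
	"PRAGMA journal_mode=WAL",  // concurrent reads alongside a single writer
	"PRAGMA busy_timeout=5000", // wait up to 5 seconds on lock contention
}

// configureSQLite limits the pool to one connection and then applies the
// PRAGMAs, so they land on the connection that later queries will use.
func configureSQLite(db *sql.DB) error {
	db.SetMaxOpenConns(1)
	for _, pragma := range sqlitePragmas {
		if _, err := db.Exec(pragma); err != nil {
			return fmt.Errorf("apply %q: %w", pragma, err)
		}
	}
	return nil
}
```

Capping the pool before running the PRAGMAs matters because `busy_timeout` is a per-connection setting; with more than one pooled connection, some queries could run on a connection that never received it.
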
### Tests added (sqlite_test.go)
- `TestNewSQLiteStore_Good / _Bad` — creation and invalid path handling
- `TestSQLiteQuotasRoundTrip_Good` — save/load round-trip
- `TestSQLite_QuotasOverwrite_Good` — the latest quota snapshot replaces previous rows
- `TestSQLiteStateRoundTrip_Good` — multi-model state with nanosecond precision
- `TestSQLiteStateOverwrite_Good` — delete-then-insert atomicity
- `TestSQLiteEmptyState_Good` — fresh database returns empty maps
- `TestNewWithSQLite_Good / TestNewWithSQLiteConfig_Good` — constructor tests
- `TestSQLitePersistAndLoad_Good` — full persist + reload cycle
- `TestSQLitePersistMultipleModels_Good` — multi-provider persistence
- `TestSQLiteConcurrent_Good` — 10 goroutines x 20 ops, race-clean
- `TestYAMLBackwardCompat_Good` — existing YAML tests pass unchanged
- `TestMigrateYAMLToSQLite_Good / _Bad` — migration round-trip and error paths
- `TestSQLiteCorruptDB_Ugly / TestSQLiteTruncatedDB_Ugly` — graceful corrupt DB recovery
- `TestSQLiteEndToEnd_Good` — full two-session scenario
---
## Phase 3 — Integration (Planned)
Not yet implemented. Intended downstream integrations:
- Wire into `go-ml` backends so rate limiting is enforced automatically on
inference calls without caller involvement
- Wire into the `go-ai` facade so all providers share a single rate limit layer
- Export metrics (requests/minute, tokens/minute, rejection counts) for
monitoring dashboards
---
## Known Limitations
**CountTokens URL is hardcoded.** The exported `CountTokens` helper calls
`generativelanguage.googleapis.com` directly. Callers cannot redirect it to
Gemini-compatible proxies or alternate endpoints without going through an
internal helper or refactoring the API to accept a base URL or `http.Client`.

**saveState is a full table replace.** On every `Persist()` call, the `requests`,
`tokens`, and `daily` tables are truncated and rewritten. For a limiter tracking
many models with high RPM, this means writing hundreds of rows on every persist
call. A future optimisation would use incremental writes (insert-only, with
periodic vacuuming of expired rows).

**No TTL on SQLite rows.** Entries older than one minute are pruned from the
in-memory `UsageStats` on every operation, but SQLite has no expiry of its own:
the surviving state is written wholesale on `Persist()`. The database does not
grow unboundedly between persist cycles because `saveState` replaces all rows,
but if `Persist()` is called frequently the WAL file can grow transiently.

**WaitForCapacity now sleeps using `Decide`'s `RetryAfter` hint** (with a
one-second fallback when no hint exists). This reduces busy looping on long
windows but remains coarse for sub-second smoothing; callers that need
sub-second pacing should implement their own loop.

**No automatic persistence.** `Persist()` must be called explicitly. If a
process exits without calling `Persist()`, any usage recorded since the last
persist is lost. Callers are responsible for calling `Persist()` at appropriate
intervals (e.g., after each `RecordUsage()` call, or on a ticker).