<!-- SPDX-License-Identifier: EUPL-1.2 -->
# Project History
## Origin
go-ratelimit was extracted from the `pkg/ratelimit` package inside
`forge.lthn.ai/core/go` on 19 February 2026. The package now lives at
`dappco.re/go/core/go-ratelimit`, with its own repository and independent
development cadence.
Initial commit: `fa1a6fc` — `feat: extract go-ratelimit from core/go pkg/ratelimit`
At extraction the package implemented:
- Sliding window rate limiter with 1-minute window
- Daily request caps per model
- Token counting via Google `CountTokens` API
- Hardcoded Gemini quota defaults (`gemini-3-pro-preview`: 150 RPM / 1M TPM / 1000 RPD)
- YAML persistence to `~/.core/ratelimits.yaml`
- Single test file with basic sliding window and quota enforcement tests
---
## Phase 0 — Hardening and Test Coverage
Commit: `3c63b10` — `feat(ratelimit): generalise beyond Gemini with provider profiles and push coverage to 95%`
Supplementary commit: `db958f2` — `test: expand race coverage and benchmarks`
Coverage increased from 77.1% to above the 95% floor. The test suite was rewritten using
testify with table-driven subtests throughout.
### Tests added
- `TestCanSend` — boundary conditions at exact RPM, TPM, and RPD limits;
RPM-only and TPM-only quotas; zero-token estimates; unknown and unlimited models
- `TestPrune` — pruning of old entries, retention of recent entries, daily reset
at 24-hour boundary, no-op on non-existent model, boundary-exact timestamps
- `TestRecordUsage` — fresh state, accumulation, insertion into existing state
- `TestReset` — single model, all models (empty string argument), non-existent model
- `TestWaitForCapacity` — context cancellation, pre-cancelled context,
immediate capacity, unknown model
- `TestStats` / `TestAllStats` — known, unknown, and quota-only models; pruning
and daily reset inside `AllStats()`
- `TestPersistAndLoad` — round-trip, missing file, corrupt YAML, unreadable file,
nested directory creation, unwritable directory
- `TestConcurrentAccess` — 20 goroutines x 50 ops each (CanSend + RecordUsage + Stats)
- `TestConcurrentResetAndRecord` — concurrent Reset + RecordUsage + AllStats
- `TestConcurrentMultipleModels` — 5 models, concurrent access
- `TestConcurrentPersistAndLoad` — filesystem race between Persist and Load
- `TestConcurrentWaitForCapacityAndRecordUsage` — WaitForCapacity racing RecordUsage
### Benchmarks added
- `BenchmarkCanSend` — 1,000-entry sliding window
- `BenchmarkRecordUsage`
- `BenchmarkCanSendConcurrent` — parallel goroutines
- `BenchmarkCanSendWithPrune` — 500 old + 500 new entries
- `BenchmarkStats` — 1,000-entry window
- `BenchmarkAllStats` — 5 models x 200 entries
- `BenchmarkPersist` — YAML I/O
`go test -race ./...` passed clean. `go vet ./...` produced no warnings.
---
## Phase 1 — Generalisation Beyond Gemini
Commit: `3c63b10` — included in the same commit as Phase 0
The hardcoded Gemini quotas in `New()` were replaced with a provider-agnostic
configuration system without breaking the existing API.
### New types and functions
- `Provider` string type with constants: `ProviderGemini`, `ProviderOpenAI`,
`ProviderAnthropic`, `ProviderLocal`
- `ProviderProfile` — bundles a provider identifier with its model quota map
- `Config` — construction configuration accepting `FilePath`, `Backend`,
`Providers`, and `Quotas` fields
- `DefaultProfiles()` — returns fresh pre-configured profiles for all four providers
- `NewWithConfig(Config)` — creates a limiter from explicit configuration
- `SetQuota(model, quota)` — runtime quota modification, mutex-protected
- `AddProvider(provider)` — loads all default quotas for a provider at runtime,
additive, mutex-protected
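A minimal construction sketch using these entry points. The import path is the module path above; the error return from `NewWithConfig`, the `Quota` type name, and its `RPM`/`TPM`/`RPD` fields are assumptions made for illustration, not taken from the package:

```go
package main

import (
	"log"

	ratelimit "dappco.re/go/core/go-ratelimit"
)

func main() {
	// Construct from explicit configuration: Gemini profile defaults plus a
	// custom YAML state path (the constructor's error return is assumed here).
	rl, err := ratelimit.NewWithConfig(ratelimit.Config{
		FilePath:  "/tmp/ratelimits.yaml",
		Providers: []ratelimit.Provider{ratelimit.ProviderGemini},
	})
	if err != nil {
		log.Fatal(err)
	}

	// Runtime modification, both mutex-protected.
	rl.SetQuota("my-fine-tune", ratelimit.Quota{RPM: 60, TPM: 100_000, RPD: 5_000})
	rl.AddProvider(ratelimit.ProviderAnthropic) // additive: loads the Anthropic defaults
}
```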
### Backward compatibility
`New()` delegates to `NewWithConfig(Config{Providers: []Provider{ProviderGemini}})`.
`TestNewBackwardCompatibility` asserts exact parity with the original hardcoded
values. No existing call sites required modification.
### Design decision: merge-on-top
Explicit `Config.Quotas` override provider profile defaults. This allows callers
to use a provider profile for most models while customising specific model limits
without forking the entire profile.
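Continuing the sketch above, with the same caveat that the `Quotas` map shape and the `Quota` field names are illustrative: the Gemini profile supplies defaults for every Gemini model, and the explicit entry replaces only the named one.

```go
// Field names (Quotas as map[string]Quota, RPM/TPM/RPD) are illustrative.
cfg := ratelimit.Config{
	Providers: []ratelimit.Provider{ratelimit.ProviderGemini},
	Quotas: map[string]ratelimit.Quota{
		// Overrides only this model; every other Gemini default stays intact.
		"gemini-3-pro-preview": {RPM: 30, TPM: 200_000, RPD: 500},
	},
}
```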
---
## Phase 2 — SQLite Persistent State
Commit: `1afb1d6` — `feat(persist): Phase 2 — SQLite backend with WAL mode`
The YAML backend serialises the full state on every `Persist()` call and is
not safe for concurrent multi-process access. Phase 2 added a SQLite backend
using `modernc.org/sqlite` (pure Go, no CGO) following the go-store pattern
established elsewhere in the ecosystem.
### New constructors
- `NewWithSQLite(dbPath string)` — SQLite-backed limiter with Gemini defaults
- `NewWithSQLiteConfig(dbPath string, cfg Config)` — SQLite-backed with custom config
- `Close() error` — releases the database connection; no-op on YAML-backed limiters
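A usage sketch for the SQLite path (constructor error return assumed, path illustrative):

```go
package main

import (
	"log"

	ratelimit "dappco.re/go/core/go-ratelimit"
)

func main() {
	// SQLite-backed limiter with the Gemini defaults; Close releases the
	// database connection once the limiter is no longer needed.
	rl, err := ratelimit.NewWithSQLite("/var/lib/myapp/ratelimits.db")
	if err != nil {
		log.Fatal(err)
	}
	defer rl.Close()

	// CanSend / RecordUsage / Persist behave as with the YAML backend.
}
```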
### Migration
- `MigrateYAMLToSQLite(yamlPath, sqlitePath string) error` — one-shot migration
helper that reads an existing YAML state file and writes all quotas and usage
state to a new SQLite database
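A one-shot migration sketch for an existing deployment, meant to run once before switching constructors (paths follow the default YAML location mentioned above):

```go
package main

import (
	"log"

	ratelimit "dappco.re/go/core/go-ratelimit"
)

func main() {
	yamlPath := "/home/user/.core/ratelimits.yaml" // existing YAML state file
	sqlitePath := "/home/user/.core/ratelimits.db" // new SQLite database
	if err := ratelimit.MigrateYAMLToSQLite(yamlPath, sqlitePath); err != nil {
		log.Fatalf("migration failed: %v", err)
	}
}
```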
### SQLite connection settings
- `PRAGMA journal_mode=WAL` — enables concurrent reads alongside a single writer
- `PRAGMA busy_timeout=5000` — 5-second wait on lock contention before returning an error
- `db.SetMaxOpenConns(1)` — single connection for PRAGMA consistency
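Roughly what this setup looks like with `modernc.org/sqlite`, as a sketch rather than the backend's actual code (the driver registers itself under the name `sqlite`):

```go
package main

import (
	"database/sql"
	"log"

	_ "modernc.org/sqlite" // pure-Go driver, registers as "sqlite"
)

func openStore(dbPath string) (*sql.DB, error) {
	db, err := sql.Open("sqlite", dbPath)
	if err != nil {
		return nil, err
	}
	// WAL allows concurrent readers alongside the single writer; busy_timeout
	// makes lock contention wait up to 5s instead of failing immediately.
	for _, pragma := range []string{
		"PRAGMA journal_mode=WAL",
		"PRAGMA busy_timeout=5000",
	} {
		if _, err := db.Exec(pragma); err != nil {
			db.Close()
			return nil, err
		}
	}
	// A single connection keeps the PRAGMAs applied consistently.
	db.SetMaxOpenConns(1)
	return db, nil
}

func main() {
	db, err := openStore("ratelimits.db")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
}
```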
### Tests added (sqlite_test.go)
- `TestNewSQLiteStore_Good / _Bad` — creation and invalid path handling
- `TestSQLiteQuotasRoundTrip_Good` — save/load round-trip
- `TestSQLite_QuotasOverwrite_Good` — the latest quota snapshot replaces previous rows
- `TestSQLiteStateRoundTrip_Good` — multi-model state with nanosecond precision
- `TestSQLiteStateOverwrite_Good` — delete-then-insert atomicity
- `TestSQLiteEmptyState_Good` — fresh database returns empty maps
- `TestNewWithSQLite_Good / TestNewWithSQLiteConfig_Good` — constructor tests
- `TestSQLitePersistAndLoad_Good` — full persist + reload cycle
- `TestSQLitePersistMultipleModels_Good` — multi-provider persistence
- `TestSQLiteConcurrent_Good` — 10 goroutines x 20 ops, race-clean
- `TestYAMLBackwardCompat_Good` — existing YAML tests pass unchanged
- `TestMigrateYAMLToSQLite_Good / _Bad` — migration round-trip and error paths
- `TestSQLiteCorruptDB_Ugly / TestSQLiteTruncatedDB_Ugly` — graceful corrupt DB recovery
- `TestSQLiteEndToEnd_Good` — full two-session scenario
---
## Phase 3 — Integration (Planned)
Not yet implemented. Intended downstream integrations:
- Wire into `go-ml` backends so rate limiting is enforced automatically on
inference calls without caller involvement
- Wire into the `go-ai` facade so all providers share a single rate limit layer
- Export metrics (requests/minute, tokens/minute, rejection counts) for
monitoring dashboards
---
## Known Limitations
**CountTokens URL is hardcoded.** The exported `CountTokens` helper calls
`generativelanguage.googleapis.com` directly. Callers cannot redirect it to
Gemini-compatible proxies or alternate endpoints without going through an
internal helper or refactoring the API to accept a base URL or `http.Client`.
**saveState is a full table replace.** On every `Persist()` call, the `requests`,
`tokens`, and `daily` tables are truncated and rewritten. For a limiter tracking
many models with high RPM, this means writing hundreds of rows on every persist
call. A future optimisation would use incremental writes (insert-only, with
periodic vacuuming of expired rows).
**No TTL on SQLite rows.** Historical rows older than one minute are pruned from
the in-memory `UsageStats` on every operation but are written wholesale to
SQLite on `Persist()`. The database does not grow unboundedly between persist
cycles because `saveState` replaces all rows, but if `Persist()` is called
frequently the WAL file can grow transiently.
**WaitForCapacity now sleeps using `Decide`'s `RetryAfter` hint** (with a
one-second fallback when no hint exists). This reduces busy looping on long
windows but remains coarse for sub-second smoothing; callers that need
sub-second pacing should implement their own loop.
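A caller-side pacing sketch for that case. The `CanSend` signature in the interface below (model plus estimated tokens returning a boolean) is an assumption for illustration, not the package's documented API:

```go
package pacing

import (
	"context"
	"time"
)

// capacityChecker matches the limiter's CanSend method as assumed here;
// check the package for the real signature before relying on this.
type capacityChecker interface {
	CanSend(model string, estimatedTokens int) bool
}

// waitSubSecond polls every 100ms instead of relying on WaitForCapacity's
// coarser RetryAfter-based sleep.
func waitSubSecond(ctx context.Context, rl capacityChecker, model string, tokens int) error {
	ticker := time.NewTicker(100 * time.Millisecond)
	defer ticker.Stop()
	for {
		if rl.CanSend(model, tokens) {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
		}
	}
}
```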
**No automatic persistence.** `Persist()` must be called explicitly. If a
process exits without calling `Persist()`, any usage recorded since the last
persist is lost. Callers are responsible for calling `Persist()` at appropriate
intervals (e.g., after each `RecordUsage()` call, or on a ticker).
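A ticker-based sketch of the second option; the `Persist() error` signature is assumed here:

```go
package persistence

import (
	"context"
	"log"
	"time"
)

// persister matches the limiter's Persist method as assumed here (error return).
type persister interface {
	Persist() error
}

// persistOnTicker flushes state every interval and once more on shutdown,
// bounding how much recorded usage can be lost if the process exits.
func persistOnTicker(ctx context.Context, rl persister, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			if err := rl.Persist(); err != nil {
				log.Printf("final persist: %v", err)
			}
			return
		case <-ticker.C:
			if err := rl.Persist(); err != nil {
				log.Printf("persist: %v", err)
			}
		}
	}
}
```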