Project History
Origin
go-ratelimit was extracted from the pkg/ratelimit package inside
forge.lthn.ai/core/go on 19 February 2026. The package now lives at
dappco.re/go/core/go-ratelimit, with its own repository and independent
development cadence.
Initial commit: fa1a6fc — feat: extract go-ratelimit from core/go pkg/ratelimit
At extraction the package implemented:
- Sliding window rate limiter with 1-minute window
- Daily request caps per model
- Token counting via Google CountTokens API
- Hardcoded Gemini quota defaults (gemini-3-pro-preview: 150 RPM / 1M TPM / 1000 RPD)
- YAML persistence to ~/.core/ratelimits.yaml
- Single test file with basic sliding window and quota enforcement tests
Phase 0 — Hardening and Test Coverage
Commit: 3c63b10 — feat(ratelimit): generalise beyond Gemini with provider profiles and push coverage to 95%
Supplementary commit: db958f2 — test: expand race coverage and benchmarks
Coverage increased from 77.1% to above the 95% floor. The test suite was rewritten using testify with table-driven subtests throughout.
Tests added
- TestCanSend — boundary conditions at exact RPM, TPM, and RPD limits; RPM-only and TPM-only quotas; zero-token estimates; unknown and unlimited models
- TestPrune — pruning of old entries, retention of recent entries, daily reset at 24-hour boundary, no-op on non-existent model, boundary-exact timestamps
- TestRecordUsage — fresh state, accumulation, insertion into existing state
- TestReset — single model, all models (empty string argument), non-existent model
- TestWaitForCapacity — context cancellation, pre-cancelled context, immediate capacity, unknown model
- TestStats / TestAllStats — known, unknown, and quota-only models; pruning and daily reset inside AllStats()
- TestPersistAndLoad — round-trip, missing file, corrupt YAML, unreadable file, nested directory creation, unwritable directory
- TestConcurrentAccess — 20 goroutines x 50 ops each (CanSend + RecordUsage + Stats)
- TestConcurrentResetAndRecord — concurrent Reset + RecordUsage + AllStats
- TestConcurrentMultipleModels — 5 models, concurrent access
- TestConcurrentPersistAndLoad — filesystem race between Persist and Load
- TestConcurrentWaitForCapacityAndRecordUsage — WaitForCapacity racing RecordUsage
Benchmarks added
- BenchmarkCanSend — 1,000-entry sliding window
- BenchmarkRecordUsage
- BenchmarkCanSendConcurrent — parallel goroutines
- BenchmarkCanSendWithPrune — 500 old + 500 new entries
- BenchmarkStats — 1,000-entry window
- BenchmarkAllStats — 5 models x 200 entries
- BenchmarkPersist — YAML I/O
go test -race ./... passed clean. go vet ./... produced no warnings.
Phase 1 — Generalisation Beyond Gemini
Commit: 3c63b10 — included in the same commit as Phase 0
The hardcoded Gemini quotas in New() were replaced with a provider-agnostic
configuration system without breaking the existing API.
New types and functions
- Provider — string type with constants ProviderGemini, ProviderOpenAI, ProviderAnthropic, ProviderLocal
- ProviderProfile — bundles a provider identifier with its model quota map
- Config — construction configuration accepting FilePath, Backend, Providers, and Quotas fields
- DefaultProfiles() — returns fresh pre-configured profiles for all four providers
- NewWithConfig(Config) — creates a limiter from explicit configuration
- SetQuota(model, quota) — runtime quota modification, mutex-protected
- AddProvider(provider) — loads all default quotas for a provider at runtime, additive, mutex-protected
Backward compatibility
New() delegates to NewWithConfig(Config{Providers: []Provider{ProviderGemini}}).
TestNewBackwardCompatibility asserts exact parity with the original hardcoded
values. No existing call sites required modification.
Design decision: merge-on-top
Explicit Config.Quotas override provider profile defaults. This allows callers
to use a provider profile for most models while customising specific model limits
without forking the entire profile.
Phase 2 — SQLite Persistent State
Commit: 1afb1d6 — feat(persist): Phase 2 — SQLite backend with WAL mode
The YAML backend serialises the full state on every Persist() call and is
not safe for concurrent multi-process access. Phase 2 added a SQLite backend
using modernc.org/sqlite (pure Go, no CGO) following the go-store pattern
established elsewhere in the ecosystem.
New constructors
- NewWithSQLite(dbPath string) — SQLite-backed limiter with Gemini defaults
- NewWithSQLiteConfig(dbPath string, cfg Config) — SQLite-backed with custom config
- Close() error — releases the database connection; no-op on YAML-backed limiters
Migration
- MigrateYAMLToSQLite(yamlPath, sqlitePath string) error — one-shot migration helper that reads an existing YAML state file and writes all quotas and usage state to a new SQLite database
SQLite connection settings
- PRAGMA journal_mode=WAL — enables concurrent reads alongside a single writer
- PRAGMA busy_timeout=5000 — 5-second wait on lock contention before returning an error
- db.SetMaxOpenConns(1) — single connection for PRAGMA consistency
Tests added (sqlite_test.go)
- TestNewSQLiteStore_Good / _Bad — creation and invalid path handling
- TestSQLiteQuotasRoundTrip_Good — save/load round-trip
- TestSQLite_QuotasOverwrite_Good — the latest quota snapshot replaces previous rows
- TestSQLiteStateRoundTrip_Good — multi-model state with nanosecond precision
- TestSQLiteStateOverwrite_Good — delete-then-insert atomicity
- TestSQLiteEmptyState_Good — fresh database returns empty maps
- TestNewWithSQLite_Good / TestNewWithSQLiteConfig_Good — constructor tests
- TestSQLitePersistAndLoad_Good — full persist + reload cycle
- TestSQLitePersistMultipleModels_Good — multi-provider persistence
- TestSQLiteConcurrent_Good — 10 goroutines x 20 ops, race-clean
- TestYAMLBackwardCompat_Good — existing YAML tests pass unchanged
- TestMigrateYAMLToSQLite_Good / _Bad — migration round-trip and error paths
- TestSQLiteCorruptDB_Ugly / TestSQLiteTruncatedDB_Ugly — graceful corrupt DB recovery
- TestSQLiteEndToEnd_Good — full two-session scenario
Phase 3 — Integration (Planned)
Not yet implemented. Intended downstream integrations:
- Wire into go-mlbackends so rate limiting is enforced automatically on inference calls without caller involvement
- Wire into go-aifacade so all providers share a single rate limit layer
- Export metrics (requests/minute, tokens/minute, rejection counts) for monitoring dashboards
Known Limitations
CountTokens URL is hardcoded. The exported CountTokens helper calls
generativelanguage.googleapis.com directly. Callers cannot redirect it to
Gemini-compatible proxies or alternate endpoints without going through an
internal helper or refactoring the API to accept a base URL or http.Client.
saveState is a full table replace. On every Persist() call, the requests,
tokens, and daily tables are truncated and rewritten. For a limiter tracking
many models with high RPM, this means writing hundreds of rows on every persist
call. A future optimisation would use incremental writes (insert-only, with
periodic vacuuming of expired rows).
No TTL on SQLite rows. Historical rows older than one minute are pruned from
the in-memory UsageStats on every operation but are written wholesale to
SQLite on Persist(). The database does not grow unboundedly between persist
cycles because saveState replaces all rows, but if Persist() is called
frequently the WAL file can grow transiently.
WaitForCapacity now sleeps using Decide’s RetryAfter hint (with a
one-second fallback when no hint exists). This reduces busy looping on long
windows but remains coarse for sub-second smoothing; callers that need
sub-second pacing should implement their own loop.
No automatic persistence. Persist() must be called explicitly. If a
process exits without calling Persist(), any usage recorded since the last
persist is lost. Callers are responsible for calling Persist() at appropriate
intervals (e.g., after each RecordUsage() call, or on a ticker).