# Project History

## Origin

go-ratelimit was extracted from the `pkg/ratelimit` package inside `forge.lthn.ai/core/go` on 19 February 2026. The extraction gave the package its own module path, repository, and independent development cadence.

Initial commit: `fa1a6fc` — `feat: extract go-ratelimit from core/go pkg/ratelimit`

At extraction the package implemented:

- Sliding window rate limiter with 1-minute window
- Daily request caps per model
- Token counting via Google `CountTokens` API
- Hardcoded Gemini quota defaults (`gemini-3-pro-preview`: 150 RPM / 1M TPM / 1000 RPD)
- YAML persistence to `~/.core/ratelimits.yaml`
- Single test file with basic sliding window and quota enforcement tests

---

## Phase 0 — Hardening and Test Coverage

Commit: `3c63b10` — `feat(ratelimit): generalise beyond Gemini with provider profiles and push coverage to 95%`

Supplementary commit: `db958f2` — `test: expand race coverage and benchmarks`

Coverage increased from 77.1% to 95.1%. The test suite was rewritten using testify with table-driven subtests throughout.
### Tests added

- `TestCanSend` — boundary conditions at exact RPM, TPM, and RPD limits; RPM-only and TPM-only quotas; zero-token estimates; unknown and unlimited models
- `TestPrune` — pruning of old entries, retention of recent entries, daily reset at 24-hour boundary, no-op on non-existent model, boundary-exact timestamps
- `TestRecordUsage` — fresh state, accumulation, insertion into existing state
- `TestReset` — single model, all models (empty string argument), non-existent model
- `TestWaitForCapacity` — context cancellation, pre-cancelled context, immediate capacity, unknown model
- `TestStats` / `TestAllStats` — known, unknown, and quota-only models; pruning and daily reset inside `AllStats()`
- `TestPersistAndLoad` — round-trip, missing file, corrupt YAML, unreadable file, nested directory creation, unwritable directory
- `TestConcurrentAccess` — 20 goroutines x 50 ops each (CanSend + RecordUsage + Stats)
- `TestConcurrentResetAndRecord` — concurrent Reset + RecordUsage + AllStats
- `TestConcurrentMultipleModels` — 5 models, concurrent access
- `TestConcurrentPersistAndLoad` — filesystem race between Persist and Load
- `TestConcurrentWaitForCapacityAndRecordUsage` — WaitForCapacity racing RecordUsage

### Benchmarks added

- `BenchmarkCanSend` — 1,000-entry sliding window
- `BenchmarkRecordUsage`
- `BenchmarkCanSendConcurrent` — parallel goroutines
- `BenchmarkCanSendWithPrune` — 500 old + 500 new entries
- `BenchmarkStats` — 1,000-entry window
- `BenchmarkAllStats` — 5 models x 200 entries
- `BenchmarkPersist` — YAML I/O

### Remaining uncovered paths (5%)

These three paths are structurally impossible to cover in unit tests without modifying production code:

1. `CountTokens` success path — the Google API URL is hardcoded; unit tests cannot intercept the HTTP call without URL injection support
2. `yaml.Marshal` error path in `Persist()` — `yaml.Marshal` does not fail on valid Go structs; the error branch exists for correctness only
3. `os.UserHomeDir()` error path in `NewWithConfig()` — triggered only when `$HOME` is unset, which test infrastructure prevents

`go test -race ./...` passed clean. `go vet ./...` produced no warnings.

---

## Phase 1 — Generalisation Beyond Gemini

Commit: `3c63b10` — included in the same commit as Phase 0

The hardcoded Gemini quotas in `New()` were replaced with a provider-agnostic configuration system without breaking the existing API.

### New types and functions

- `Provider` string type with constants: `ProviderGemini`, `ProviderOpenAI`, `ProviderAnthropic`, `ProviderLocal`
- `ProviderProfile` — bundles a provider identifier with its model quota map
- `Config` — construction configuration accepting `FilePath`, `Backend`, `Providers`, and `Quotas` fields
- `DefaultProfiles()` — returns fresh pre-configured profiles for all four providers
- `NewWithConfig(Config)` — creates a limiter from explicit configuration
- `SetQuota(model, quota)` — runtime quota modification, mutex-protected
- `AddProvider(provider)` — loads all default quotas for a provider at runtime, additive, mutex-protected

### Backward compatibility

`New()` delegates to `NewWithConfig(Config{Providers: []Provider{ProviderGemini}})`. `TestNewBackwardCompatibility` asserts exact parity with the original hardcoded values. No existing call sites required modification.

### Design decision: merge-on-top

Entries in `Config.Quotas` override provider profile defaults. This allows callers to use a provider profile for most models while customising specific model limits without forking the entire profile.

---

## Phase 2 — SQLite Persistent State

Commit: `1afb1d6` — `feat(persist): Phase 2 — SQLite backend with WAL mode`

The YAML backend serialises the full state on every `Persist()` call and is not safe for concurrent multi-process access. Phase 2 added a SQLite backend using `modernc.org/sqlite` (pure Go, no CGO), following the go-store pattern established elsewhere in the ecosystem.
### New constructors

- `NewWithSQLite(dbPath string)` — SQLite-backed limiter with Gemini defaults
- `NewWithSQLiteConfig(dbPath string, cfg Config)` — SQLite-backed limiter with custom config
- `Close() error` — releases the database connection; no-op on YAML-backed limiters

### Migration

- `MigrateYAMLToSQLite(yamlPath, sqlitePath string) error` — one-shot migration helper that reads an existing YAML state file and writes all quotas and usage state to a new SQLite database

### SQLite connection settings

- `PRAGMA journal_mode=WAL` — enables concurrent reads alongside a single writer
- `PRAGMA busy_timeout=5000` — 5-second wait on lock contention before returning an error
- `db.SetMaxOpenConns(1)` — single connection for PRAGMA consistency

### Tests added (sqlite_test.go)

- `TestNewSQLiteStore_Good / _Bad` — creation and invalid path handling
- `TestSQLiteQuotasRoundTrip_Good` — save/load round-trip
- `TestSQLiteQuotasUpsert_Good` — upsert replaces existing rows
- `TestSQLiteStateRoundTrip_Good` — multi-model state with nanosecond precision
- `TestSQLiteStateOverwrite_Good` — delete-then-insert atomicity
- `TestSQLiteEmptyState_Good` — fresh database returns empty maps
- `TestNewWithSQLite_Good / TestNewWithSQLiteConfig_Good` — constructor tests
- `TestSQLitePersistAndLoad_Good` — full persist + reload cycle
- `TestSQLitePersistMultipleModels_Good` — multi-provider persistence
- `TestSQLiteConcurrent_Good` — 10 goroutines x 20 ops, race-clean
- `TestYAMLBackwardCompat_Good` — existing YAML tests pass unchanged
- `TestMigrateYAMLToSQLite_Good / _Bad` — migration round-trip and error paths
- `TestSQLiteCorruptDB_Ugly / TestSQLiteTruncatedDB_Ugly` — graceful corrupt DB recovery
- `TestSQLiteEndToEnd_Good` — full two-session scenario

---

## Phase 3 — Integration (Planned)

Not yet implemented.
Intended downstream integrations:

- Wire into `go-ml` backends so rate limiting is enforced automatically on inference calls without caller involvement
- Wire into the `go-ai` facade so all providers share a single rate limit layer
- Export metrics (requests/minute, tokens/minute, rejection counts) for monitoring dashboards

---

## Known Limitations

**CountTokens URL is hardcoded.** The `CountTokens` helper calls `generativelanguage.googleapis.com` directly. There is no way to override the base URL, which prevents testing the success path in unit tests and prevents use with Gemini-compatible proxies. A future refactor would accept a base URL parameter or an `http.Client`.

**saveState is a full table replace.** On every `Persist()` call, the `requests`, `tokens`, and `daily` tables are truncated and rewritten. For a limiter tracking many models with high RPM, this means writing hundreds of rows on every persist call. A future optimisation would use incremental writes (insert-only, with periodic vacuuming of expired rows).

**No TTL on SQLite rows.** Historical rows older than one minute are pruned from the in-memory `UsageStats` on every operation but are written wholesale to SQLite on `Persist()`. The database does not grow unboundedly between persist cycles because `saveState` replaces all rows, but if `Persist()` is called frequently the WAL file can grow transiently.

**WaitForCapacity polling interval is fixed at 1 second.** This is appropriate for RPM-scale limits but is coarse for sub-second limits. If a caller needs finer-grained waiting (e.g., smoothing requests within a minute), they must implement their own loop.

**No automatic persistence.** `Persist()` must be called explicitly. If a process exits without calling `Persist()`, any usage recorded since the last persist is lost. Callers are responsible for calling `Persist()` at appropriate intervals (e.g., after each `RecordUsage()` call, or on a ticker).