<!-- SPDX-License-Identifier: EUPL-1.2 -->
# Project History

## Origin

go-ratelimit was extracted from the `pkg/ratelimit` package inside `forge.lthn.ai/core/go` on 19 February 2026. The package now lives at `dappco.re/go/core/go-ratelimit`, with its own repository and independent development cadence.

Initial commit: `fa1a6fc` — `feat: extract go-ratelimit from core/go pkg/ratelimit`

At extraction the package implemented:

- Sliding window rate limiter with a 1-minute window
- Daily request caps per model
- Token counting via the Google `CountTokens` API
- Hardcoded Gemini quota defaults (`gemini-3-pro-preview`: 150 RPM / 1M TPM / 1000 RPD)
- YAML persistence to `~/.core/ratelimits.yaml`
- A single test file with basic sliding-window and quota-enforcement tests

---

## Phase 0 — Hardening and Test Coverage

Commit: `3c63b10` — `feat(ratelimit): generalise beyond Gemini with provider profiles and push coverage to 95%`

Supplementary commit: `db958f2` — `test: expand race coverage and benchmarks`

Coverage increased from 77.1% to above the 95% floor. The test suite was rewritten using testify, with table-driven subtests throughout.

### Tests added

- `TestCanSend` — boundary conditions at exact RPM, TPM, and RPD limits; RPM-only and TPM-only quotas; zero-token estimates; unknown and unlimited models
- `TestPrune` — pruning of old entries, retention of recent entries, daily reset at the 24-hour boundary, no-op on a non-existent model, boundary-exact timestamps
- `TestRecordUsage` — fresh state, accumulation, insertion into existing state
- `TestReset` — single model, all models (empty string argument), non-existent model
- `TestWaitForCapacity` — context cancellation, pre-cancelled context, immediate capacity, unknown model
- `TestStats` / `TestAllStats` — known, unknown, and quota-only models; pruning and daily reset inside `AllStats()`
- `TestPersistAndLoad` — round-trip, missing file, corrupt YAML, unreadable file, nested directory creation, unwritable directory
- `TestConcurrentAccess` — 20 goroutines x 50 ops each (CanSend + RecordUsage + Stats)
- `TestConcurrentResetAndRecord` — concurrent Reset + RecordUsage + AllStats
- `TestConcurrentMultipleModels` — 5 models, concurrent access
- `TestConcurrentPersistAndLoad` — filesystem race between Persist and Load
- `TestConcurrentWaitForCapacityAndRecordUsage` — `WaitForCapacity` racing `RecordUsage`

### Benchmarks added

- `BenchmarkCanSend` — 1,000-entry sliding window
- `BenchmarkRecordUsage`
- `BenchmarkCanSendConcurrent` — parallel goroutines
- `BenchmarkCanSendWithPrune` — 500 old + 500 new entries
- `BenchmarkStats` — 1,000-entry window
- `BenchmarkAllStats` — 5 models x 200 entries
- `BenchmarkPersist` — YAML I/O

`go test -race ./...` passed clean. `go vet ./...` produced no warnings.

---

## Phase 1 — Generalisation Beyond Gemini

Commit: `3c63b10` — included in the same commit as Phase 0

The hardcoded Gemini quotas in `New()` were replaced with a provider-agnostic configuration system without breaking the existing API.

### New types and functions

- `Provider` string type with constants: `ProviderGemini`, `ProviderOpenAI`, `ProviderAnthropic`, `ProviderLocal`
- `ProviderProfile` — bundles a provider identifier with its model quota map
- `Config` — construction configuration accepting `FilePath`, `Backend`, `Providers`, and `Quotas` fields
- `DefaultProfiles()` — returns fresh pre-configured profiles for all four providers
- `NewWithConfig(Config)` — creates a limiter from explicit configuration
- `SetQuota(model, quota)` — runtime quota modification, mutex-protected
- `AddProvider(provider)` — loads all default quotas for a provider at runtime; additive, mutex-protected

### Backward compatibility

`New()` delegates to `NewWithConfig(Config{Providers: []Provider{ProviderGemini}})`. `TestNewBackwardCompatibility` asserts exact parity with the original hardcoded values. No existing call sites required modification.

### Design decision: merge-on-top

Explicit `Config.Quotas` entries override provider profile defaults. This allows callers to use a provider profile for most models while customising specific model limits, without forking the entire profile.
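
The merge-on-top rule can be sketched as a plain map overlay. The `Quota` fields and the second model name below are illustrative stand-ins, not the package's actual types:

```go
package main

import "fmt"

// Quota mirrors the shape described above: requests/min, tokens/min, requests/day.
type Quota struct{ RPM, TPM, RPD int }

// merge overlays explicit quotas on top of profile defaults;
// an explicit entry wins for its model, all other models keep defaults.
func merge(defaults, explicit map[string]Quota) map[string]Quota {
	out := make(map[string]Quota, len(defaults))
	for m, q := range defaults {
		out[m] = q
	}
	for m, q := range explicit {
		out[m] = q
	}
	return out
}

func main() {
	defaults := map[string]Quota{
		"gemini-3-pro-preview": {RPM: 150, TPM: 1_000_000, RPD: 1000},
		"some-other-model":     {RPM: 60, TPM: 500_000, RPD: 500}, // hypothetical entry
	}
	// Caller tightens one model's RPM; everything else keeps profile defaults.
	explicit := map[string]Quota{"gemini-3-pro-preview": {RPM: 30, TPM: 1_000_000, RPD: 1000}}
	q := merge(defaults, explicit)
	fmt.Println(q["gemini-3-pro-preview"].RPM, q["some-other-model"].RPM) // 30 60
}
```
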

---

## Phase 2 — SQLite Persistent State

Commit: `1afb1d6` — `feat(persist): Phase 2 — SQLite backend with WAL mode`

The YAML backend serialises the full state on every `Persist()` call and is not safe for concurrent multi-process access. Phase 2 added a SQLite backend using `modernc.org/sqlite` (pure Go, no CGO), following the go-store pattern established elsewhere in the ecosystem.

### New constructors

- `NewWithSQLite(dbPath string)` — SQLite-backed limiter with Gemini defaults
- `NewWithSQLiteConfig(dbPath string, cfg Config)` — SQLite-backed limiter with custom configuration
- `Close() error` — releases the database connection; a no-op on YAML-backed limiters

### Migration

- `MigrateYAMLToSQLite(yamlPath, sqlitePath string) error` — one-shot migration helper that reads an existing YAML state file and writes all quotas and usage state to a new SQLite database

### SQLite connection settings

- `PRAGMA journal_mode=WAL` — enables concurrent reads alongside a single writer
- `PRAGMA busy_timeout=5000` — waits 5 seconds on lock contention before returning an error
- `db.SetMaxOpenConns(1)` — a single connection keeps PRAGMA state consistent

### Tests added (sqlite_test.go)

- `TestNewSQLiteStore_Good / _Bad` — creation and invalid-path handling
- `TestSQLiteQuotasRoundTrip_Good` — save/load round-trip
- `TestSQLite_QuotasOverwrite_Good` — the latest quota snapshot replaces previous rows
- `TestSQLiteStateRoundTrip_Good` — multi-model state with nanosecond precision
- `TestSQLiteStateOverwrite_Good` — delete-then-insert atomicity
- `TestSQLiteEmptyState_Good` — fresh database returns empty maps
- `TestNewWithSQLite_Good` / `TestNewWithSQLiteConfig_Good` — constructor tests
- `TestSQLitePersistAndLoad_Good` — full persist + reload cycle
- `TestSQLitePersistMultipleModels_Good` — multi-provider persistence
- `TestSQLiteConcurrent_Good` — 10 goroutines x 20 ops, race-clean
- `TestYAMLBackwardCompat_Good` — existing YAML tests pass unchanged
- `TestMigrateYAMLToSQLite_Good / _Bad` — migration round-trip and error paths
- `TestSQLiteCorruptDB_Ugly` / `TestSQLiteTruncatedDB_Ugly` — graceful recovery from a corrupt database
- `TestSQLiteEndToEnd_Good` — full two-session scenario

---

## Phase 3 — Integration (Planned)

Not yet implemented. Intended downstream integrations:

- Wire into `go-ml` backends so rate limiting is enforced automatically on inference calls, without caller involvement
- Wire into the `go-ai` facade so all providers share a single rate-limit layer
- Export metrics (requests/minute, tokens/minute, rejection counts) for monitoring dashboards

---

## Known Limitations

**CountTokens URL is hardcoded.** The exported `CountTokens` helper calls `generativelanguage.googleapis.com` directly. Callers cannot redirect it to Gemini-compatible proxies or alternate endpoints without going through an internal helper or refactoring the API to accept a base URL or `http.Client`.

**saveState is a full table replace.** On every `Persist()` call, the `requests`, `tokens`, and `daily` tables are truncated and rewritten. For a limiter tracking many models at high RPM, this means writing hundreds of rows per persist call. A future optimisation would use incremental writes (insert-only, with periodic vacuuming of expired rows).

**No TTL on SQLite rows.** Historical rows older than one minute are pruned from the in-memory `UsageStats` on every operation but are written wholesale to SQLite on `Persist()`. The database does not grow unboundedly between persist cycles, because `saveState` replaces all rows, but if `Persist()` is called frequently the WAL file can grow transiently.

**WaitForCapacity polling interval is fixed at 1 second.** This is appropriate for RPM-scale limits but coarse for sub-second limits. Callers that need finer-grained waiting (e.g., smoothing requests within a minute) must implement their own loop.

**No automatic persistence.** `Persist()` must be called explicitly. If a process exits without calling `Persist()`, any usage recorded since the last persist is lost. Callers are responsible for calling `Persist()` at appropriate intervals (e.g., after each `RecordUsage()` call, or on a ticker).