<!-- SPDX-License-Identifier: EUPL-1.2 -->
# Project History
## Origin
go-ratelimit was extracted from the `pkg/ratelimit` package inside
`forge.lthn.ai/core/go` on 19 February 2026. The package now lives at
`dappco.re/go/core/go-ratelimit`, with its own repository and independent
development cadence.
Initial commit: `fa1a6fc` — `feat: extract go-ratelimit from core/go pkg/ratelimit`
At extraction the package implemented:
- Sliding window rate limiter with 1-minute window
- Daily request caps per model
- Token counting via Google `CountTokens` API
- Hardcoded Gemini quota defaults (`gemini-3-pro-preview`: 150 RPM / 1M TPM / 1000 RPD)
- YAML persistence to `~/.core/ratelimits.yaml`
- Single test file with basic sliding window and quota enforcement tests
---
## Phase 0 — Hardening and Test Coverage
Commit: `3c63b10` — `feat(ratelimit): generalise beyond Gemini with provider profiles and push coverage to 95%`
Supplementary commit: `db958f2` — `test: expand race coverage and benchmarks`
Coverage increased from 77.1% to above the 95% floor. The test suite was rewritten using
testify with table-driven subtests throughout.
### Tests added
- `TestCanSend` — boundary conditions at exact RPM, TPM, and RPD limits;
RPM-only and TPM-only quotas; zero-token estimates; unknown and unlimited models
- `TestPrune` — pruning of old entries, retention of recent entries, daily reset
at 24-hour boundary, no-op on non-existent model, boundary-exact timestamps
- `TestRecordUsage` — fresh state, accumulation, insertion into existing state
- `TestReset` — single model, all models (empty string argument), non-existent model
- `TestWaitForCapacity` — context cancellation, pre-cancelled context,
immediate capacity, unknown model
- `TestStats` / `TestAllStats` — known, unknown, and quota-only models; pruning
and daily reset inside `AllStats()`
- `TestPersistAndLoad` — round-trip, missing file, corrupt YAML, unreadable file,
nested directory creation, unwritable directory
- `TestConcurrentAccess` — 20 goroutines x 50 ops each (CanSend + RecordUsage + Stats)
- `TestConcurrentResetAndRecord` — concurrent Reset + RecordUsage + AllStats
- `TestConcurrentMultipleModels` — 5 models, concurrent access
- `TestConcurrentPersistAndLoad` — filesystem race between Persist and Load
- `TestConcurrentWaitForCapacityAndRecordUsage` — WaitForCapacity racing RecordUsage
### Benchmarks added
- `BenchmarkCanSend` — 1,000-entry sliding window
- `BenchmarkRecordUsage`
- `BenchmarkCanSendConcurrent` — parallel goroutines
- `BenchmarkCanSendWithPrune` — 500 old + 500 new entries
- `BenchmarkStats` — 1,000-entry window
- `BenchmarkAllStats` — 5 models x 200 entries
- `BenchmarkPersist` — YAML I/O
`go test -race ./...` passed clean. `go vet ./...` produced no warnings.
---
## Phase 1 — Generalisation Beyond Gemini
Commit: `3c63b10` — included in the same commit as Phase 0
The hardcoded Gemini quotas in `New()` were replaced with a provider-agnostic
configuration system without breaking the existing API.
### New types and functions
- `Provider` string type with constants: `ProviderGemini`, `ProviderOpenAI`,
`ProviderAnthropic`, `ProviderLocal`
- `ProviderProfile` — bundles a provider identifier with its model quota map
- `Config` — construction configuration accepting `FilePath`, `Backend`,
`Providers`, and `Quotas` fields
- `DefaultProfiles()` — returns fresh pre-configured profiles for all four providers
- `NewWithConfig(Config)` — creates a limiter from explicit configuration
- `SetQuota(model, quota)` — runtime quota modification, mutex-protected
- `AddProvider(provider)` — loads all default quotas for a provider at runtime,
additive, mutex-protected
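A minimal construction sketch using these entry points. The import path is the module path above; the error return from `NewWithConfig`, the `Quota` type name, and its `RPM`/`TPM`/`RPD` fields are assumptions made for illustration, not taken from the package:

```go
package main

import (
	"log"

	ratelimit "dappco.re/go/core/go-ratelimit"
)

func main() {
	// Construct from explicit configuration: Gemini profile defaults plus a
	// custom YAML state path (the constructor's error return is assumed here).
	rl, err := ratelimit.NewWithConfig(ratelimit.Config{
		FilePath:  "/tmp/ratelimits.yaml",
		Providers: []ratelimit.Provider{ratelimit.ProviderGemini},
	})
	if err != nil {
		log.Fatal(err)
	}

	// Runtime modification, both mutex-protected.
	rl.SetQuota("my-fine-tune", ratelimit.Quota{RPM: 60, TPM: 100_000, RPD: 5_000})
	rl.AddProvider(ratelimit.ProviderAnthropic) // additive: loads the Anthropic defaults
}
```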
### Backward compatibility
`New()` delegates to `NewWithConfig(Config{Providers: []Provider{ProviderGemini}})`.
`TestNewBackwardCompatibility` asserts exact parity with the original hardcoded
values. No existing call sites required modification.
### Design decision: merge-on-top
Explicit `Config.Quotas` override provider profile defaults. This allows callers
to use a provider profile for most models while customising specific model limits
without forking the entire profile.
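Continuing the sketch above, with the same caveat that the `Quotas` map shape and the `Quota` field names are illustrative: the Gemini profile supplies defaults for every Gemini model, and the explicit entry replaces only the named one.

```go
// Field names (Quotas as map[string]Quota, RPM/TPM/RPD) are illustrative.
cfg := ratelimit.Config{
	Providers: []ratelimit.Provider{ratelimit.ProviderGemini},
	Quotas: map[string]ratelimit.Quota{
		// Overrides only this model; every other Gemini default stays intact.
		"gemini-3-pro-preview": {RPM: 30, TPM: 200_000, RPD: 500},
	},
}
```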
---
## Phase 2 — SQLite Persistent State
Commit: `1afb1d6` — `feat(persist): Phase 2 — SQLite backend with WAL mode`
The YAML backend serialises the full state on every `Persist()` call and is
not safe for concurrent multi-process access. Phase 2 added a SQLite backend
using `modernc.org/sqlite` (pure Go, no CGO) following the go-store pattern
established elsewhere in the ecosystem.
### New constructors
- `NewWithSQLite(dbPath string)` — SQLite-backed limiter with Gemini defaults
- `NewWithSQLiteConfig(dbPath string, cfg Config)` — SQLite-backed with custom config
- `Close() error` — releases the database connection; no-op on YAML-backed limiters
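A usage sketch for the SQLite path (constructor error return assumed, path illustrative):

```go
package main

import (
	"log"

	ratelimit "dappco.re/go/core/go-ratelimit"
)

func main() {
	// SQLite-backed limiter with the Gemini defaults; Close releases the
	// database connection once the limiter is no longer needed.
	rl, err := ratelimit.NewWithSQLite("/var/lib/myapp/ratelimits.db")
	if err != nil {
		log.Fatal(err)
	}
	defer rl.Close()

	// CanSend / RecordUsage / Persist behave as with the YAML backend.
}
```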
### Migration
- `MigrateYAMLToSQLite(yamlPath, sqlitePath string) error` — one-shot migration
helper that reads an existing YAML state file and writes all quotas and usage
state to a new SQLite database
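A one-shot migration sketch for an existing deployment, meant to run once before switching constructors (paths follow the default YAML location mentioned above):

```go
package main

import (
	"log"

	ratelimit "dappco.re/go/core/go-ratelimit"
)

func main() {
	yamlPath := "/home/user/.core/ratelimits.yaml" // existing YAML state file
	sqlitePath := "/home/user/.core/ratelimits.db" // new SQLite database
	if err := ratelimit.MigrateYAMLToSQLite(yamlPath, sqlitePath); err != nil {
		log.Fatalf("migration failed: %v", err)
	}
}
```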
### SQLite connection settings
- `PRAGMA journal_mode=WAL` — enables concurrent reads alongside a single writer
- `PRAGMA busy_timeout=5000` — 5-second wait on lock contention before returning an error
- `db.SetMaxOpenConns(1)` — single connection for PRAGMA consistency
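Roughly what this setup looks like with `modernc.org/sqlite`, as a sketch rather than the backend's actual code (the driver registers itself under the name `sqlite`):

```go
package main

import (
	"database/sql"
	"log"

	_ "modernc.org/sqlite" // pure-Go driver, registers as "sqlite"
)

func openStore(dbPath string) (*sql.DB, error) {
	db, err := sql.Open("sqlite", dbPath)
	if err != nil {
		return nil, err
	}
	// WAL allows concurrent readers alongside the single writer; busy_timeout
	// makes lock contention wait up to 5s instead of failing immediately.
	for _, pragma := range []string{
		"PRAGMA journal_mode=WAL",
		"PRAGMA busy_timeout=5000",
	} {
		if _, err := db.Exec(pragma); err != nil {
			db.Close()
			return nil, err
		}
	}
	// A single connection keeps the PRAGMAs applied consistently.
	db.SetMaxOpenConns(1)
	return db, nil
}

func main() {
	db, err := openStore("ratelimits.db")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
}
```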
### Tests added (sqlite_test.go)
- `TestNewSQLiteStore_Good / _Bad` — creation and invalid path handling
- `TestSQLiteQuotasRoundTrip_Good` — save/load round-trip
- `TestSQLite_QuotasOverwrite_Good` — the latest quota snapshot replaces previous rows
- `TestSQLiteStateRoundTrip_Good` — multi-model state with nanosecond precision
- `TestSQLiteStateOverwrite_Good` — delete-then-insert atomicity
- `TestSQLiteEmptyState_Good` — fresh database returns empty maps
- `TestNewWithSQLite_Good / TestNewWithSQLiteConfig_Good` — constructor tests
- `TestSQLitePersistAndLoad_Good` — full persist + reload cycle
- `TestSQLitePersistMultipleModels_Good` — multi-provider persistence
- `TestSQLiteConcurrent_Good` — 10 goroutines x 20 ops, race-clean
- `TestYAMLBackwardCompat_Good` — existing YAML tests pass unchanged
- `TestMigrateYAMLToSQLite_Good / _Bad` — migration round-trip and error paths
- `TestSQLiteCorruptDB_Ugly / TestSQLiteTruncatedDB_Ugly` — graceful corrupt DB recovery
- `TestSQLiteEndToEnd_Good` — full two-session scenario
---
## Phase 3 — Integration (Planned)
Not yet implemented. Intended downstream integrations:
- Wire into `go-ml` backends so rate limiting is enforced automatically on
inference calls without caller involvement
- Wire into the `go-ai` facade so all providers share a single rate limit layer
- Export metrics (requests/minute, tokens/minute, rejection counts) for
monitoring dashboards
---
## Known Limitations
**CountTokens URL is hardcoded.** The exported `CountTokens` helper calls
`generativelanguage.googleapis.com` directly. Callers cannot redirect it to
Gemini-compatible proxies or alternate endpoints without going through an
internal helper or refactoring the API to accept a base URL or `http.Client`.
**saveState is a full table replace.** On every `Persist()` call, the `requests`,
`tokens`, and `daily` tables are truncated and rewritten. For a limiter tracking
many models with high RPM, this means writing hundreds of rows on every persist
call. A future optimisation would use incremental writes (insert-only, with
periodic vacuuming of expired rows).
**No TTL on SQLite rows.** Historical rows older than one minute are pruned from
the in-memory `UsageStats` on every operation but are written wholesale to
SQLite on `Persist()`. The database does not grow unboundedly between persist
cycles because `saveState` replaces all rows, but if `Persist()` is called
frequently the WAL file can grow transiently.
**WaitForCapacity now sleeps using `Decide`'s `RetryAfter` hint** (with a
one-second fallback when no hint exists). This reduces busy looping on long
windows but remains coarse for sub-second smoothing; callers that need
sub-second pacing should implement their own loop.
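A caller-side pacing sketch for that case. The `CanSend` signature in the interface below (model plus estimated tokens returning a boolean) is an assumption for illustration, not the package's documented API:

```go
package pacing

import (
	"context"
	"time"
)

// capacityChecker matches the limiter's CanSend method as assumed here;
// check the package for the real signature before relying on this.
type capacityChecker interface {
	CanSend(model string, estimatedTokens int) bool
}

// waitSubSecond polls every 100ms instead of relying on WaitForCapacity's
// coarser RetryAfter-based sleep.
func waitSubSecond(ctx context.Context, rl capacityChecker, model string, tokens int) error {
	ticker := time.NewTicker(100 * time.Millisecond)
	defer ticker.Stop()
	for {
		if rl.CanSend(model, tokens) {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
		}
	}
}
```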
**No automatic persistence.** `Persist()` must be called explicitly. If a
process exits without calling `Persist()`, any usage recorded since the last
persist is lost. Callers are responsible for calling `Persist()` at appropriate
intervals (e.g., after each `RecordUsage()` call, or on a ticker).
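A ticker-based sketch of the second option; the `Persist() error` signature is assumed here:

```go
package persistence

import (
	"context"
	"log"
	"time"
)

// persister matches the limiter's Persist method as assumed here (error return).
type persister interface {
	Persist() error
}

// persistOnTicker flushes state every interval and once more on shutdown,
// bounding how much recorded usage can be lost if the process exits.
func persistOnTicker(ctx context.Context, rl persister, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			if err := rl.Persist(); err != nil {
				log.Printf("final persist: %v", err)
			}
			return
		case <-ticker.C:
			if err := rl.Persist(); err != nil {
				log.Printf("persist: %v", err)
			}
		}
	}
}
```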