feat(ratelimit): generalise beyond Gemini with provider profiles and push coverage to 95%
Phase 0: Rewrite test suite with testify (77.1% -> 95.1% coverage). Add boundary tests, concurrent access tests, benchmarks, error path coverage for Load/Persist, Reset, Stats, and CountTokens.

Phase 1: Extract hardcoded Gemini quotas into provider-agnostic config. Add Provider type, DefaultProfiles(), NewWithConfig(), SetQuota(), and AddProvider(). Pre-configured profiles for Gemini, OpenAI, Anthropic, and Local. New() retains exact backward compatibility via delegation.

Co-Authored-By: Charon <developers@lethean.io>
parent 666deed718
commit 3c63b1022a
6 changed files with 1339 additions and 152 deletions

FINDINGS.md (83 changes)

@@ -21,3 +21,86 @@ Extracted from `forge.lthn.ai/core/go` on 19 Feb 2026.
 ### Tests
 
 - 1 test file covering sliding window and quota enforcement
+
+---
+
+## 2026-02-20: Phase 0 -- Hardening (Charon)
+
+### Coverage: 77.1% -> 95.1%
+
+Rewrote test suite with testify assert/require. Table-driven subtests throughout.
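
The table-driven boundary style mentioned above could look roughly like the following. This is a minimal sketch, not the code in `ratelimit_test.go` (whose diff is suppressed): `quota`, `usage`, `canSend`, and the exact "at the limit" rule are assumptions made for illustration, and a real test would wrap each row in `t.Run` with testify assertions.

```go
package main

import "fmt"

// quota mirrors the three limits from the diff; 0 means unlimited.
type quota struct{ maxRPM, maxTPM, maxRPD int }

// usage is a hypothetical snapshot of one model's current window.
type usage struct{ requestsLastMinute, tokensLastMinute, requestsToday int }

// canSend is an assumed decision rule, not the package's CanSend:
// a request fits if it does not push any non-zero limit past its maximum.
func canSend(q quota, u usage, estTokens int) bool {
	if q.maxRPM > 0 && u.requestsLastMinute+1 > q.maxRPM {
		return false
	}
	if q.maxTPM > 0 && u.tokensLastMinute+estTokens > q.maxTPM {
		return false
	}
	if q.maxRPD > 0 && u.requestsToday+1 > q.maxRPD {
		return false
	}
	return true
}

// boundaryCase is one row of the table.
type boundaryCase struct {
	name      string
	q         quota
	u         usage
	estTokens int
	want      bool
}

// boundaryCases probes each limit just below and exactly at its boundary.
func boundaryCases() []boundaryCase {
	q := quota{maxRPM: 2, maxTPM: 100, maxRPD: 3}
	return []boundaryCase{
		{"one below RPM", q, usage{1, 0, 0}, 10, true},
		{"at RPM", q, usage{2, 0, 0}, 10, false},
		{"lands exactly on TPM", q, usage{0, 90, 0}, 10, true},
		{"one token over TPM", q, usage{0, 90, 0}, 11, false},
		{"at RPD", q, usage{0, 0, 3}, 10, false},
		{"zero limits are unlimited", quota{}, usage{999, 999999, 999}, 10, true},
	}
}

// failures returns the names of rows whose result differs from want.
func failures() []string {
	var bad []string
	for _, tc := range boundaryCases() {
		if got := canSend(tc.q, tc.u, tc.estTokens); got != tc.want {
			bad = append(bad, tc.name)
		}
	}
	return bad
}

func main() {
	fmt.Println("failing cases:", len(failures()))
}
```

Keeping the rows in a slice makes it cheap to add a case for each new limit or edge condition without touching the loop.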
+
+#### Tests added
+
+- **CanSend boundaries**: exact RPM/TPM/RPD limits, RPM-only, TPM-only, zero-token estimates, unknown models, unlimited models
+- **Prune**: keeps recent entries, prunes old ones, daily reset at 24h, boundary-exact timestamps, noop on non-existent model
+- **RecordUsage**: fresh state, accumulation, existing state
+- **Reset**: single model, all models (empty string), non-existent model
+- **WaitForCapacity**: immediate capacity, context cancellation, pre-cancelled context, unknown model
+- **Stats/AllStats**: known/unknown/quota-only models, pruning in AllStats, daily reset in AllStats
+- **Persist/Load**: round-trip, non-existent file, corrupt YAML, unreadable file, nested directory creation, unwritable directory
+- **Concurrency**: 20 goroutines x 50 ops (CanSend + RecordUsage + Stats), concurrent Reset + RecordUsage + AllStats
+- **Benchmarks**: BenchmarkCanSend (1000-entry window), BenchmarkRecordUsage, BenchmarkCanSendConcurrent
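
The shape of the concurrency test in the list above (20 goroutines, 50 operations each, mixed reads and writes) can be sketched as follows. `miniLimiter`, `record`, `count`, and `hammer` are hypothetical stand-ins for the real `RateLimiter` methods; only the `sync.RWMutex` pattern is taken from the findings. Running a test like this under `go test -race` is what lets the race detector observe the interleavings.

```go
package main

import (
	"fmt"
	"sync"
)

// miniLimiter is a hypothetical stand-in for RateLimiter:
// a per-model request counter guarded by a sync.RWMutex.
type miniLimiter struct {
	mu     sync.RWMutex
	counts map[string]int
}

func newMiniLimiter() *miniLimiter {
	return &miniLimiter{counts: make(map[string]int)}
}

// record takes the write lock, as a RecordUsage-style mutation would.
func (l *miniLimiter) record(model string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.counts[model]++
}

// count takes only the read lock, as a Stats-style query would.
func (l *miniLimiter) count(model string) int {
	l.mu.RLock()
	defer l.mu.RUnlock()
	return l.counts[model]
}

// hammer starts `workers` goroutines that each perform `ops`
// read-then-write pairs, waits for them, and returns the final count.
func hammer(workers, ops int) int {
	l := newMiniLimiter()
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < ops; i++ {
				_ = l.count("demo-model")
				l.record("demo-model")
			}
		}()
	}
	wg.Wait()
	return l.count("demo-model")
}

func main() {
	// 20 goroutines x 50 ops, matching the shape described above.
	fmt.Println(hammer(20, 50))
}
```

If any increment were lost to a race, the final count would fall short of `workers * ops`, so the total doubles as a cheap correctness check alongside the race detector.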
+
+#### Remaining uncovered (5%)
+
+- `CountTokens` success path: hardcoded Google URL prevents unit testing without URL injection. Only the connection-error path is covered.
+- `yaml.Marshal` error in `Persist()`: virtually impossible to trigger with valid structs.
+- `os.UserHomeDir` error in `NewWithConfig()`: only reached when `FilePath` is empty, and on Unix-like systems it fails only when `$HOME` is unset.
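
The URL injection mentioned for `CountTokens` could be approached along these lines. This is a sketch under stated assumptions rather than the package's code: the `countTokens` signature, the `/models/<model>:countTokens` path, the request body, and the `totalTokens` response field are all illustrative, and the commit does not show the real implementation. The point is only that once the base URL is a parameter, `net/http/httptest` can stand in for the remote service.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// countTokens is a hypothetical variant that receives the base URL and
// HTTP client from the caller instead of hardcoding them.
func countTokens(client *http.Client, baseURL, model, text string) (int, error) {
	payload, err := json.Marshal(map[string]string{"text": text})
	if err != nil {
		return 0, err
	}
	// The path layout is an assumption made for this sketch.
	url := baseURL + "/models/" + model + ":countTokens"
	resp, err := client.Post(url, "application/json", bytes.NewReader(payload))
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return 0, fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	// The response shape is likewise assumed.
	var out struct {
		TotalTokens int `json:"totalTokens"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return 0, err
	}
	return out.TotalTokens, nil
}

// countWithFakeServer exercises the success path against a local test
// server that always reports 42 tokens.
func countWithFakeServer(text string) (int, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"totalTokens": 42}`)
	}))
	defer srv.Close()
	return countTokens(srv.Client(), srv.URL, "demo-model", text)
}

func main() {
	n, err := countWithFakeServer("hello world")
	fmt.Println(n, err)
}
```

The same server handler can return a non-200 status or malformed JSON to cover the error branches without any network access beyond loopback.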
+
+### Race detector
+
+`go test -race ./...` passes clean: no data races were detected on the exercised paths, which is consistent with the `sync.RWMutex` guarding all shared state.
+
+### go vet
+
+No warnings.
+
+---
+
+## 2026-02-20: Phase 1 -- Generalisation (Charon)
+
+### Problem
+
+Hardcoded Gemini-specific quotas in `New()`. No way to configure for other providers.
+
+### Solution
+
+Introduced provider-agnostic configuration without breaking the existing API.
+
+#### New types
+
+- `Provider` -- string type with constants: `ProviderGemini`, `ProviderOpenAI`, `ProviderAnthropic`, `ProviderLocal`
+- `ProviderProfile` -- bundles provider identity with model quotas map
+- `Config` -- construction config with `FilePath`, `Providers` list, `Quotas` map
+
+#### New functions
+
+- `DefaultProfiles()` -- returns pre-configured profiles for all four providers
+- `NewWithConfig(Config)` -- creates limiter from explicit configuration
+- `SetQuota(model, quota)` -- runtime quota modification
+- `AddProvider(provider)` -- loads all default quotas for a provider at runtime
+
+#### Provider defaults (Feb 2026)
+
+| Provider | Models | RPM | TPM | RPD |
+|----------|--------|-----|-----|-----|
+| Gemini | gemini-3-pro-preview, gemini-3-flash-preview, gemini-2.5-pro | 150 | 1M | 1000 |
+| Gemini | gemini-2.0-flash | 150 | 1M | unlimited |
+| Gemini | gemini-2.0-flash-lite | unlimited | unlimited | unlimited |
+| OpenAI | gpt-4o, gpt-4-turbo, o1 | 500 | 30K | unlimited |
+| OpenAI | gpt-4o-mini, o1-mini, o3-mini | 500 | 200K | unlimited |
+| Anthropic | claude-opus-4, claude-sonnet-4 | 50 | 40K | unlimited |
+| Anthropic | claude-haiku-3.5 | 50 | 50K | unlimited |
+| Local | (none by default) | -- | -- | -- |
+
+#### Backward compatibility
+
+`New()` delegates to `NewWithConfig(Config{Providers: []Provider{ProviderGemini}})`. Verified by `TestNewBackwardCompatibility` which asserts exact parity with the original hardcoded values.
+
+#### Design notes
+
+- Explicit quotas in `Config.Quotas` override provider defaults (merge-on-top pattern)
+- Local provider has no default quotas -- users add per-model limits for hardware throttling
+- `AddProvider()` is additive -- it leaves quotas for other models in place, but overwrites existing quotas for models named in the provider's profile
+- All new methods are mutex-protected and safe for concurrent use
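
The merge-on-top pattern from the design notes can be illustrated in isolation. This is a standalone sketch, not the package itself: `mergeQuotas` and `demo` are hypothetical helpers that mirror the two loops in `NewWithConfig` (provider profiles first, explicit quotas second), and `local-llama` is an invented model name. The OpenAI numbers come from the defaults table above.

```go
package main

import "fmt"

// ModelQuota has the same fields as the struct in the diff (0 = unlimited).
type ModelQuota struct{ MaxRPM, MaxTPM, MaxRPD int }

// mergeQuotas is a hypothetical helper: it copies each provider profile in
// order, then the explicit quotas, so the last write for a model wins.
func mergeQuotas(profiles []map[string]ModelQuota, explicit map[string]ModelQuota) map[string]ModelQuota {
	merged := make(map[string]ModelQuota)
	for _, profile := range profiles {
		for model, q := range profile {
			merged[model] = q
		}
	}
	for model, q := range explicit {
		merged[model] = q
	}
	return merged
}

// demo merges two OpenAI defaults with one override and one local model.
func demo() map[string]ModelQuota {
	openAI := map[string]ModelQuota{
		"gpt-4o":      {MaxRPM: 500, MaxTPM: 30000},
		"gpt-4o-mini": {MaxRPM: 500, MaxTPM: 200000},
	}
	explicit := map[string]ModelQuota{
		"gpt-4o":      {MaxRPM: 60, MaxTPM: 30000}, // overrides the provider default
		"local-llama": {MaxRPM: 10},                // hypothetical local model
	}
	return mergeQuotas([]map[string]ModelQuota{openAI}, explicit)
}

func main() {
	m := demo()
	fmt.Println(m["gpt-4o"].MaxRPM, m["gpt-4o-mini"].MaxRPM, m["local-llama"].MaxRPM)
}
```

Because the explicit map is applied last, a caller can load a whole provider profile and still tighten a single model without redefining the rest.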

TODO.md (21 changes)

@@ -1,4 +1,4 @@
-# TODO.md — go-ratelimit
+# TODO.md -- go-ratelimit
 
 Dispatched from core/go orchestration. Pick up tasks in order.
 
@@ -6,20 +6,23 @@ Dispatched from core/go orchestration. Pick up tasks in order.
 
 ## Phase 0: Hardening & Test Coverage
 
-- [ ] **Expand test coverage** — `ratelimit_test.go` exists. Add tests for: `CanSend()` at exact limits (RPM, TPM, RPD boundaries), `RecordUsage()` with concurrent goroutines (race test), `WaitForCapacity()` timeout behaviour, `prune()` sliding window edge cases, daily reset logic (cross-midnight), YAML persistence (save + reload state), empty/corrupt state file recovery.
-- [ ] **Race condition test** — `go test -race ./...` with 10 goroutines calling `CanSend()` + `RecordUsage()` concurrently. The `sync.RWMutex` should handle it but verify.
-- [ ] **Benchmark** — Add `BenchmarkCanSend` and `BenchmarkRecordUsage` with 1000 entries in sliding window. Measure prune() overhead.
-- [ ] **`go vet ./...` clean** — Fix any warnings.
+- [x] **Expand test coverage** -- `ratelimit_test.go` rewritten with testify. Tests for: `CanSend()` at exact limits (RPM, TPM, RPD boundaries), `RecordUsage()` with concurrent goroutines, `WaitForCapacity()` timeout and immediate-capacity paths, `prune()` sliding window edge cases, daily reset logic (24h boundary), YAML persistence (save + reload), corrupt/unreadable state file recovery, `Reset()` single/all/nonexistent, `Stats()` known/unknown/quota-only models, `AllStats()` with pruning and daily reset.
+- [x] **Race condition test** -- `go test -race ./...` with 20 goroutines calling `CanSend()` + `RecordUsage()` + `Stats()` concurrently. Additional test with concurrent `Reset()` + `RecordUsage()` + `AllStats()`. All pass clean.
+- [x] **Benchmark** -- `BenchmarkCanSend` (1000-entry window), `BenchmarkRecordUsage`, `BenchmarkCanSendConcurrent` (parallel). Measures prune() overhead.
+- [x] **`go vet ./...` clean** -- No warnings.
+- **Coverage: 95.1%** (up from 77.1%). Remaining uncovered: `CountTokens` success path (hardcoded Google URL), `yaml.Marshal` error path in `Persist()`, `os.UserHomeDir` error path in `NewWithConfig`.
 
 ## Phase 1: Generalise Beyond Gemini
 
-- [ ] Hardcoded model quotas are Gemini-specific — abstract to provider-agnostic config
-- [ ] Add quota profiles for OpenAI, Anthropic, and local (Ollama/MLX) backends
-- [ ] Make default quotas configurable via YAML or environment variables
+- [x] **Provider-agnostic config** -- Added `Provider` type, `ProviderProfile`, `Config` struct, `NewWithConfig()` constructor. Quotas are no longer hardcoded in `New()`.
+- [x] **Quota profiles** -- `DefaultProfiles()` returns pre-configured profiles for Gemini, OpenAI (gpt-4o, o1, o3-mini), Anthropic (claude-opus-4, claude-sonnet-4, claude-haiku-3.5), and Local (empty, user-configurable).
+- [x] **Configurable defaults** -- `Config` struct accepts `FilePath`, `Providers` list, and explicit `Quotas` map. Explicit quotas override provider defaults. YAML-serialisable.
+- [x] **Backward compatibility** -- `New()` delegates to `NewWithConfig(Config{Providers: []Provider{ProviderGemini}})`. Existing API unchanged. Test `TestNewBackwardCompatibility` verifies exact parity.
+- [x] **Runtime configuration** -- `SetQuota()` and `AddProvider()` allow modifying quotas after construction. Both are mutex-protected.
 
 ## Phase 2: Persistent State
 
-- [ ] Currently stores state in YAML file — not safe for multi-process access
+- [ ] Currently stores state in YAML file -- not safe for multi-process access
 - [ ] Consider SQLite for concurrent read/write safety (WAL mode)
 - [ ] Add state recovery on restart (reload sliding window from persisted data)
 

go.mod (6 changes)

@@ -3,3 +3,9 @@ module forge.lthn.ai/core/go-ratelimit
 go 1.25.5
 
 require gopkg.in/yaml.v3 v3.0.1
+
+require (
+	github.com/davecgh/go-spew v1.1.1 // indirect
+	github.com/pmezard/go-difflib v1.0.0 // indirect
+	github.com/stretchr/testify v1.11.1
+)

go.sum (6 changes)

@@ -1,3 +1,9 @@
+github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
+github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
+github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
+github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
+github.com/stretchr/testify v1.11.1 h1:7s2iGBzp5EwR7/aIZr8ao5+dra3wiQyKjjFuvgVKu7U=
+github.com/stretchr/testify v1.11.1/go.mod h1:wZwfW3scLgRK+23gO65QZefKpKQRnfz6sD981Nm4B6U=
 gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM=
 gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
 gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=

ratelimit.go (153 changes)

@@ -15,13 +15,48 @@ import (
 	"gopkg.in/yaml.v3"
 )
 
+// Provider identifies an LLM provider for quota profiles.
+type Provider string
+
+const (
+	// ProviderGemini is Google's Gemini family (default).
+	ProviderGemini Provider = "gemini"
+	// ProviderOpenAI is OpenAI's GPT/o-series family.
+	ProviderOpenAI Provider = "openai"
+	// ProviderAnthropic is Anthropic's Claude family.
+	ProviderAnthropic Provider = "anthropic"
+	// ProviderLocal is for local inference (Ollama, MLX, llama.cpp).
+	ProviderLocal Provider = "local"
+)
+
 // ModelQuota defines the rate limits for a specific model.
 type ModelQuota struct {
-	MaxRPM int `yaml:"max_rpm"` // Requests per minute
-	MaxTPM int `yaml:"max_tpm"` // Tokens per minute
+	MaxRPM int `yaml:"max_rpm"` // Requests per minute (0 = unlimited)
+	MaxTPM int `yaml:"max_tpm"` // Tokens per minute (0 = unlimited)
 	MaxRPD int `yaml:"max_rpd"` // Requests per day (0 = unlimited)
 }
 
+// ProviderProfile bundles model quotas for a provider.
+type ProviderProfile struct {
+	Provider Provider              `yaml:"provider"`
+	Models   map[string]ModelQuota `yaml:"models"`
+}
+
+// Config controls RateLimiter initialisation.
+type Config struct {
+	// FilePath overrides the default state file location.
+	// If empty, defaults to ~/.core/ratelimits.yaml.
+	FilePath string `yaml:"file_path,omitempty"`
+
+	// Quotas sets per-model rate limits directly.
+	// These are merged on top of any provider profile defaults.
+	Quotas map[string]ModelQuota `yaml:"quotas,omitempty"`
+
+	// Providers lists provider profiles to load.
+	// If empty and Quotas is also empty, Gemini defaults are used.
+	Providers []Provider `yaml:"providers,omitempty"`
+}
+
 // TokenEntry records a token usage event.
 type TokenEntry struct {
 	Time time.Time `yaml:"time"`
@@ -44,29 +79,121 @@ type RateLimiter struct {
 	filePath string
 }
 
-// New creates a new RateLimiter with default quotas.
+// DefaultProfiles returns pre-configured quota profiles for each provider.
+// Values are based on published rate limits as of Feb 2026.
+func DefaultProfiles() map[Provider]ProviderProfile {
+	return map[Provider]ProviderProfile{
+		ProviderGemini: {
+			Provider: ProviderGemini,
+			Models: map[string]ModelQuota{
+				"gemini-3-pro-preview":   {MaxRPM: 150, MaxTPM: 1000000, MaxRPD: 1000},
+				"gemini-3-flash-preview": {MaxRPM: 150, MaxTPM: 1000000, MaxRPD: 1000},
+				"gemini-2.5-pro":         {MaxRPM: 150, MaxTPM: 1000000, MaxRPD: 1000},
+				"gemini-2.0-flash":       {MaxRPM: 150, MaxTPM: 1000000, MaxRPD: 0}, // Unlimited RPD
+				"gemini-2.0-flash-lite":  {MaxRPM: 0, MaxTPM: 0, MaxRPD: 0},         // Unlimited
+			},
+		},
+		ProviderOpenAI: {
+			Provider: ProviderOpenAI,
+			Models: map[string]ModelQuota{
+				"gpt-4o":      {MaxRPM: 500, MaxTPM: 30000, MaxRPD: 0},
+				"gpt-4o-mini": {MaxRPM: 500, MaxTPM: 200000, MaxRPD: 0},
+				"gpt-4-turbo": {MaxRPM: 500, MaxTPM: 30000, MaxRPD: 0},
+				"o1":          {MaxRPM: 500, MaxTPM: 30000, MaxRPD: 0},
+				"o1-mini":     {MaxRPM: 500, MaxTPM: 200000, MaxRPD: 0},
+				"o3-mini":     {MaxRPM: 500, MaxTPM: 200000, MaxRPD: 0},
+			},
+		},
+		ProviderAnthropic: {
+			Provider: ProviderAnthropic,
+			Models: map[string]ModelQuota{
+				"claude-opus-4":    {MaxRPM: 50, MaxTPM: 40000, MaxRPD: 0},
+				"claude-sonnet-4":  {MaxRPM: 50, MaxTPM: 40000, MaxRPD: 0},
+				"claude-haiku-3.5": {MaxRPM: 50, MaxTPM: 50000, MaxRPD: 0},
+			},
+		},
+		ProviderLocal: {
+			Provider: ProviderLocal,
+			Models: map[string]ModelQuota{
+				// Local inference has no external rate limits by default.
+				// Users can override per-model if their hardware requires throttling.
+			},
+		},
+	}
+}
+
+// New creates a new RateLimiter with Gemini defaults.
+// This preserves backward compatibility -- existing callers are unaffected.
 func New() (*RateLimiter, error) {
-	home, err := os.UserHomeDir()
-	if err != nil {
-		return nil, err
+	return NewWithConfig(Config{
+		Providers: []Provider{ProviderGemini},
+	})
+}
+
+// NewWithConfig creates a RateLimiter from explicit configuration.
+// If no providers or quotas are specified, Gemini defaults are used.
+func NewWithConfig(cfg Config) (*RateLimiter, error) {
+	filePath := cfg.FilePath
+	if filePath == "" {
+		home, err := os.UserHomeDir()
+		if err != nil {
+			return nil, err
+		}
+		filePath = filepath.Join(home, ".core", "ratelimits.yaml")
 	}
 
 	rl := &RateLimiter{
 		Quotas:   make(map[string]ModelQuota),
 		State:    make(map[string]*UsageStats),
-		filePath: filepath.Join(home, ".core", "ratelimits.yaml"),
+		filePath: filePath,
 	}
 
-	// Default quotas based on Tier 1 observations (Feb 2026)
-	rl.Quotas["gemini-3-pro-preview"] = ModelQuota{MaxRPM: 150, MaxTPM: 1000000, MaxRPD: 1000}
-	rl.Quotas["gemini-3-flash-preview"] = ModelQuota{MaxRPM: 150, MaxTPM: 1000000, MaxRPD: 1000}
-	rl.Quotas["gemini-2.5-pro"] = ModelQuota{MaxRPM: 150, MaxTPM: 1000000, MaxRPD: 1000}
-	rl.Quotas["gemini-2.0-flash"] = ModelQuota{MaxRPM: 150, MaxTPM: 1000000, MaxRPD: 0} // Unlimited RPD
-	rl.Quotas["gemini-2.0-flash-lite"] = ModelQuota{MaxRPM: 0, MaxTPM: 0, MaxRPD: 0}    // Unlimited
+	// Load provider profiles
+	profiles := DefaultProfiles()
+	providers := cfg.Providers
+
+	// If nothing specified at all, default to Gemini
+	if len(providers) == 0 && len(cfg.Quotas) == 0 {
+		providers = []Provider{ProviderGemini}
+	}
+
+	for _, p := range providers {
+		if profile, ok := profiles[p]; ok {
+			for model, quota := range profile.Models {
+				rl.Quotas[model] = quota
+			}
+		}
+	}
+
+	// Merge explicit quotas on top (allows overrides)
+	for model, quota := range cfg.Quotas {
+		rl.Quotas[model] = quota
+	}
 
 	return rl, nil
 }
 
+// SetQuota sets or updates the quota for a specific model at runtime.
+func (rl *RateLimiter) SetQuota(model string, quota ModelQuota) {
+	rl.mu.Lock()
+	defer rl.mu.Unlock()
+	rl.Quotas[model] = quota
+}
+
+// AddProvider loads all default quotas for a provider.
+// Existing quotas for models in the profile are overwritten.
+func (rl *RateLimiter) AddProvider(provider Provider) {
+	rl.mu.Lock()
+	defer rl.mu.Unlock()
+
+	profiles := DefaultProfiles()
+	if profile, ok := profiles[provider]; ok {
+		for model, quota := range profile.Models {
+			rl.Quotas[model] = quota
+		}
+	}
+}
+
 // Load reads the state from disk.
 func (rl *RateLimiter) Load() error {
 	rl.mu.Lock()

ratelimit_test.go (1222 changes)

File diff suppressed because it is too large