docs: graduate TODO/FINDINGS into production documentation

Replace internal task tracking (TODO.md, FINDINGS.md) with structured
documentation in docs/. Trim CLAUDE.md to agent instructions only.

Co-Authored-By: Virgil <virgil@lethean.io>
Snider 2026-02-20 15:01:55 +00:00
parent f202cfe218
commit c570f08eba
6 changed files with 561 additions and 201 deletions


@@ -10,7 +10,7 @@ SQLite key-value store wrapper with TTL support and namespace isolation. Module:
go test ./... # Run all tests
go test -v -run Name # Run single test
go test -race ./... # Race detector
-go test -cover ./... # Coverage (target: 90%+)
+go test -cover ./... # Coverage (target: 95%+)
go test -bench=. ./... # Benchmarks
```
@@ -63,6 +63,12 @@ defer unreg()
## Coding Standards
- UK English
- `go test ./...` must pass before commit
- `go test -race ./...` must pass before commit
- Conventional commits: `type(scope): description`
- Co-Author: `Co-Authored-By: Virgil <virgil@lethean.io>`
## Docs
- `docs/architecture.md` — storage layer, group/key model, TTL, events, scoping
- `docs/development.md` — prerequisites, test patterns, benchmarks, adding methods
- `docs/history.md` — completed phases, known limitations, future considerations


@@ -1,69 +0,0 @@
# FINDINGS.md -- go-store
## 2026-02-19: Split from core/go (Virgil)
### Origin
Extracted from `forge.lthn.ai/core/go` `pkg/store/` on 19 Feb 2026.
### Architecture
- SQLite database with WAL mode enabled for concurrent read performance
- Compound primary key: `(group, key)` -- groups act as logical namespaces
- UPSERT semantics on write (INSERT OR REPLACE)
- Template rendering support via Go `text/template` for dynamic values
- Pure Go SQLite driver via `modernc.org/sqlite` (no CGO required)
### Dependencies
- `modernc.org/sqlite` -- pure Go SQLite implementation
### Tests
- 1 test file covering CRUD operations, group isolation, and template rendering
## 2026-02-20: Phase 0 & Phase 1 (Charon)
### Concurrency Fix: SQLITE_BUSY under contention
**Problem:** With `database/sql`'s default connection pool, concurrent goroutines would get different connections from the pool. PRAGMA statements (journal_mode, busy_timeout) are per-connection, so some connections would lack the WAL and busy_timeout settings. Under heavy concurrent writes (10 goroutines x 100 ops), this caused persistent `SQLITE_BUSY` errors even with busy_timeout set.
**Root cause:** `database/sql` pools connections. Each `Exec`/`Query` call might use a different connection. PRAGMAs applied to one connection do not apply to others.
**Fix:** `db.SetMaxOpenConns(1)` serialises all access through a single connection. This ensures PRAGMAs stick and eliminates BUSY errors. SQLite is single-writer by design, so there is no concurrency penalty -- the pool was already being serialised at the SQLite lock level, just with errors instead of queueing.
**Added:** `PRAGMA busy_timeout=5000` as a defence-in-depth measure.
### Coverage: 73.1% -> 90.9%
Remaining uncovered code (9.1%) is entirely defensive error handling in:
- `New()`: sql.Open error, busy_timeout pragma error, schema creation error -- all unreachable with the modernc.org/sqlite driver under normal operation
- `GetAll()`/`Render()`: rows.Scan and rows.Err error paths -- require internal SQLite driver corruption to trigger
These are correct defensive checks that protect against hypothetical driver/OS failures but cannot be triggered through integration tests against a real SQLite database.
### Phase 1: TTL Support
Added optional time-to-live for keys:
- `expires_at INTEGER` nullable column in the `kv` table
- `SetWithTTL(group, key, value, duration)` stores keys that auto-expire
- `Get()` performs lazy deletion of expired keys on read
- `Count()`, `GetAll()`, `Render()` all exclude expired entries from results
- `PurgeExpired()` public method for manual cleanup
- Background goroutine purges expired entries every 60 seconds (configurable via `purgeInterval` field)
- Schema migration: `ALTER TABLE kv ADD COLUMN expires_at INTEGER` handles databases created before TTL support
- `Set()` clears TTL when overwriting a key (sets `expires_at = NULL`)
### Benchmarks
```
BenchmarkSet-32 119280 10290 ns/op 328 B/op 12 allocs/op
BenchmarkGet-32 335707 3589 ns/op 576 B/op 21 allocs/op
BenchmarkGetAll-32 (10K keys) 258 4741451 ns/op 2268787 B/op 80095 allocs/op
BenchmarkSet_FileBacked-32 4525 265868 ns/op 327 B/op 12 allocs/op
```
- In-memory Set: ~97K ops/sec
- In-memory Get: ~279K ops/sec
- File-backed Set: ~3.8K ops/sec (dominated by fsync)
- GetAll with 10K keys: ~2.3MB allocated per call

TODO.md

@@ -1,130 +0,0 @@
# TODO.md -- go-store
Dispatched from core/go orchestration. Pick up tasks in order.
---
## Phase 0: Hardening & Test Coverage
- [x] **Expand test coverage** -- concurrent Set/Get with 10 goroutines (race test), Render() with invalid template syntax, Render() with missing template vars, Get() on non-existent group vs non-existent key, DeleteGroup() then verify GetAll() returns empty, Count() after bulk inserts, :memory: vs file-backed store, WAL mode verification. Coverage: 73.1% -> 90.9%.
- [x] **Edge cases** -- empty key, empty value, empty group, very long key (10K chars), binary-ish value (null bytes), Unicode keys and values, CJK, Arabic, SQL injection attempts, special characters.
- [x] **Benchmark** -- BenchmarkSet, BenchmarkGet, BenchmarkGetAll with 10K keys, BenchmarkSet_FileBacked.
- [x] **`go vet ./...` clean** -- no warnings.
- [x] **Concurrency fix** -- Added `db.SetMaxOpenConns(1)` and `PRAGMA busy_timeout=5000` to prevent SQLITE_BUSY errors under concurrent writes.
## Phase 1: TTL Support
- [x] Add optional expiry timestamp for keys (`expires_at INTEGER` column)
- [x] Background goroutine to purge expired entries (configurable interval, default 60s)
- [x] `SetWithTTL(group, key, value, duration)` API
- [x] Lazy expiry check on `Get` as fallback
- [x] `PurgeExpired()` public method for manual purge
- [x] `Count`, `GetAll`, `Render` exclude expired entries
- [x] Schema migration for pre-TTL databases (ALTER TABLE ADD COLUMN)
- [x] Tests for all TTL functionality including concurrent TTL access
## Phase 2: Namespace Isolation
Scoped store wrapper that auto-prefixes groups with a namespace to prevent key collisions across tenants. Pure Go, no new deps.
### 2.1 ScopedStore Wrapper
- [x] **Create `scope.go`** — A lightweight wrapper around `*Store` that auto-prefixes all group names:
- `type ScopedStore struct { store *Store; namespace string }` — holds reference to underlying store and namespace prefix
- `func NewScoped(store *Store, namespace string) *ScopedStore` — constructor. Validates namespace is non-empty, alphanumeric + hyphens only.
- `func (s *ScopedStore) prefix(group string) string` — returns `namespace + ":" + group`
- Implement all `Store` methods on `ScopedStore` that delegate to the underlying store with prefixed groups:
- `Get(group, key)`, `Set(group, key, value)`, `SetWithTTL(group, key, value, ttl)`
- `Delete(group, key)`, `DeleteGroup(group)`
- `GetAll(group)`, `Count(group)`
- `Render(group, key, data)`
- Each method simply calls `s.store.Method(s.prefix(group), key, ...)` — thin delegation, no logic duplication.
### 2.2 Quota Enforcement
- [x] **Add `QuotaConfig` to ScopedStore** — Optional quota limits per namespace:
- `type QuotaConfig struct { MaxKeys int; MaxGroups int }` — zero means unlimited
- `func NewScopedWithQuota(store *Store, namespace string, quota QuotaConfig) *ScopedStore`
- `var ErrQuotaExceeded = errors.New("store: quota exceeded")`
- [x] **Enforce on Set()** — Before inserting, check `Count()` across all groups with the namespace prefix. If `MaxKeys > 0` and current count >= MaxKeys, return `ErrQuotaExceeded`. Only check on new keys (UPSERT existing keys doesn't increase count).
- [x] **Enforce on group creation** — Track distinct groups with the namespace prefix. If `MaxGroups > 0` and adding a new group would exceed the limit, return `ErrQuotaExceeded`.
- [x] **Add `CountAll() (int, error)` to Store** — Returns total key count across ALL groups matching a prefix. SQL: `SELECT COUNT(*) FROM kv WHERE grp LIKE ? AND (expires_at IS NULL OR expires_at > ?)` with `namespace + ":%"`.
- [x] **Add `Groups() ([]string, error)` to Store** — Returns distinct group names. SQL: `SELECT DISTINCT grp FROM kv WHERE (expires_at IS NULL OR expires_at > ?)`. Useful for quota checks and admin tooling.
### 2.3 Tests
- [x] **ScopedStore basic tests** — Set/Get/Delete through ScopedStore, verify underlying store has prefixed groups, two namespaces don't collide, GetAll returns only scoped group's keys
- [x] **Quota tests** — (a) MaxKeys=5, insert 5 keys → OK, insert 6th → ErrQuotaExceeded, (b) UPSERT existing key doesn't count towards quota, (c) Delete + re-insert stays within quota, (d) MaxGroups=3, create 3 groups → OK, 4th → ErrQuotaExceeded, (e) zero quota = unlimited, (f) TTL-expired keys don't count towards quota
- [x] **CountAll/Groups tests** — (a) CountAll with mixed namespaces, (b) Groups returns distinct list, (c) expired keys excluded from both
- [x] **Existing tests still pass** — No changes to Store API, backward compatible. Coverage: 90.9% → 94.7%.
## Phase 3: Event Hooks
Reactive notification system for store mutations. Pure Go, no new deps. The go-ws integration point is via callbacks — go-store does NOT import go-ws.
### 3.1 Event Types (`events.go`)
- [x] **Create `events.go`** — Define the event model:
- `type EventType int` with constants: `EventSet`, `EventDelete`, `EventDeleteGroup`
- `type Event struct { Type EventType; Group string; Key string; Value string; Timestamp time.Time }` — Key is empty for `EventDeleteGroup`, Value is only populated for `EventSet`
- `func (t EventType) String() string` — returns `"set"`, `"delete"`, `"delete_group"`
### 3.2 Watcher API
- [x] **Add watcher infrastructure to Store** — New fields on `Store`:
- `watchers []*Watcher` — registered watchers
- `callbacks []callbackEntry` — registered callbacks
- `mu sync.RWMutex` — protects watchers and callbacks (separate from SQLite serialisation)
- `nextID uint64` — monotonic ID for callbacks
- [x] **`type Watcher struct`** — `Ch <-chan Event` (public read-only channel), `ch chan Event` (internal write), `group string`, `key string`, `id uint64`
- [x] **`func (s *Store) Watch(group, key string) *Watcher`** — Create a watcher with buffered channel (cap 16). `"*"` as key matches all keys in the group. `"*"` for both group and key matches everything. Returns the watcher.
- [x] **`func (s *Store) Unwatch(w *Watcher)`** — Remove watcher from slice, close its channel. Safe to call multiple times.
### 3.3 Callback Hook
- [x] **`func (s *Store) OnChange(fn func(Event)) func()`** — Register a callback for all mutations. Returns an unregister function. Callbacks are called synchronously in the emitting goroutine (caller controls concurrency). This is the go-ws integration point — consumers do:
```go
unreg := store.OnChange(func(e store.Event) {
hub.SendToChannel("store-events", e)
})
defer unreg()
```
### 3.4 Emit Events
- [x] **Modify `Set()`** — After successful DB write, call `s.notify(Event{Type: EventSet, Group: group, Key: key, Value: value, Timestamp: time.Now()})`
- [x] **Modify `SetWithTTL()`** — Same as Set but includes TTL event
- [x] **Modify `Delete()`** — Emit `EventDelete` after successful DB write
- [x] **Modify `DeleteGroup()`** — Emit `EventDeleteGroup` with Key="" after successful DB write
- [x] **`func (s *Store) notify(e Event)`** — Internal method:
1. Lock `s.mu` (read lock), iterate watchers: if watcher matches (group/key or wildcard), non-blocking send to `w.ch` (drop if full — don't block writer)
2. Call each callback `fn(e)` synchronously
3. Unlock
### 3.5 ScopedStore Events
- [x] **ScopedStore mutations emit events with full prefixed group** — No extra work needed since ScopedStore delegates to Store methods which already emit. The Event.Group will contain the full `namespace:group` string, which is correct for consumers.
### 3.6 Tests (`events_test.go`)
- [x] **Watch specific key** — Set triggers event on matching watcher, non-matching key gets nothing
- [x] **Watch wildcard `"*"`** — Multiple Sets to different keys in same group all trigger
- [x] **Watch all `("*", "*")`** — All mutations across all groups trigger
- [x] **Unwatch stops delivery** — After Unwatch, no more events on channel, channel is closed
- [x] **Delete triggers event** — EventDelete with correct group/key
- [x] **DeleteGroup triggers event** — EventDeleteGroup with empty Key
- [x] **OnChange callback fires** — Register callback, Set/Delete triggers it
- [x] **OnChange unregister** — After calling returned func, callback stops firing
- [x] **Buffer-full doesn't block** — Fill channel buffer (16 events), verify next Set doesn't block/deadlock
- [x] **Multiple watchers on same key** — Both receive events independently
- [x] **Concurrent Watch/Unwatch** — 10 goroutines adding/removing watchers while Sets happen (race test)
- [x] **ScopedStore events** — ScopedStore Set triggers event with prefixed group name
- [x] **Existing tests still pass** — No regressions. Coverage: 94.7% -> 95.5%.
---
## Workflow
1. Virgil in core/go writes tasks here after research
2. This repo's dedicated session picks up tasks in phase order
3. Mark `[x]` when done, note commit hash

docs/architecture.md Normal file

@@ -0,0 +1,218 @@
# Architecture — go-store
Module: `forge.lthn.ai/core/go-store`
## Overview
go-store is a group-namespaced key-value store backed by SQLite. It provides persistent or in-memory storage with optional TTL expiry, namespace isolation for multi-tenant use, and a reactive event system for observing mutations.
The package has no external runtime dependencies beyond a pure-Go SQLite driver (`modernc.org/sqlite`). It requires no CGO and compiles on all platforms.
## Storage Layer
### SQLite with WAL Mode
Every `Store` instance opens a single SQLite database and immediately applies two pragmas:
```sql
PRAGMA journal_mode=WAL;
PRAGMA busy_timeout=5000;
```
WAL (Write-Ahead Logging) mode allows concurrent readers to proceed without blocking writers. The `busy_timeout` of 5000 milliseconds causes the driver to wait and retry rather than immediately returning `SQLITE_BUSY` under write contention.
**Single connection constraint.** The `database/sql` package maintains a connection pool by default. SQLite pragmas are per-connection: if the pool hands out a second connection, that connection inherits none of the WAL or busy-timeout settings, causing `SQLITE_BUSY` errors under concurrent load. go-store calls `db.SetMaxOpenConns(1)` to pin all access to a single connection. Since SQLite serialises writes at the file level regardless, this introduces no additional throughput penalty.
### Schema
```sql
CREATE TABLE IF NOT EXISTS kv (
grp TEXT NOT NULL,
key TEXT NOT NULL,
value TEXT NOT NULL,
expires_at INTEGER,
PRIMARY KEY (grp, key)
)
```
The compound primary key `(grp, key)` enforces uniqueness per group-key pair and provides efficient indexed lookups. The `expires_at` column stores a Unix millisecond timestamp (nullable); a `NULL` value means the key never expires.
**Schema migration.** Databases created before TTL support lacked the `expires_at` column. On `New()`, go-store runs `ALTER TABLE kv ADD COLUMN expires_at INTEGER`. If the column already exists, SQLite returns a "duplicate column" error which is silently ignored. This allows seamless upgrades of existing databases.
## Group/Key Model
Keys are addressed by a two-level path: `(group, key)`. Groups act as logical namespaces within a single database. Groups are implicit — they exist as a consequence of the keys they contain and are destroyed automatically when all their keys are deleted.
This model maps naturally to domain concepts:
```
group: "user:42:config" key: "theme"
group: "user:42:config" key: "language"
group: "session:abc" key: "token"
```
All read operations (`Get`, `GetAll`, `Count`, `Render`) are scoped to a single group. `DeleteGroup` atomically removes all keys in a group. `CountAll` and `Groups` operate across groups by prefix match.
## UPSERT Semantics
All writes use `INSERT ... ON CONFLICT(grp, key) DO UPDATE`. This means:
- Inserting a new key creates it.
- Inserting an existing key overwrites its value and (for `Set`) clears any TTL.
- UPSERT never duplicates a key.
- The operation is idempotent with respect to row count.
`Set` clears `expires_at` on upsert by setting it to `NULL`. `SetWithTTL` refreshes the expiry timestamp on upsert.
## TTL Expiry
Keys may be created with a time-to-live via `SetWithTTL`. Expiry is stored as a Unix millisecond timestamp in `expires_at`.
**Expiry is enforced in three ways:**
1. **Lazy deletion on `Get`.** If a key is found but its `expires_at` is in the past, it is deleted synchronously before returning `ErrNotFound`. This prevents stale values from being returned even if the background purge has not run yet.
2. **Query-time filtering.** All bulk operations (`GetAll`, `Count`, `Render`, `CountAll`, `Groups`) include `(expires_at IS NULL OR expires_at > ?)` in their `WHERE` clause. Expired keys are excluded from results without being deleted.
3. **Background purge goroutine.** `New()` launches a goroutine that calls `PurgeExpired()` every 60 seconds (configurable via `s.purgeInterval`). This recovers disk space by physically removing expired rows. The goroutine is stopped cleanly by `Close()` via `context.WithCancel`.
`PurgeExpired()` is also a public method for applications that want manual control over purge timing.
## Template Rendering
`Render(tmplStr, group string)` is a convenience method that fetches all non-expired key-value pairs from a group and renders a Go `text/template` against them. The template data is a `map[string]string` keyed by the field name.
Example:
```go
st.Set("miner", "pool", "pool.lthn.io:3333")
st.Set("miner", "wallet", "iz...")
out, _ := st.Render(`{"pool":"{{ .pool }}","wallet":"{{ .wallet }}"}`, "miner")
// out: {"pool":"pool.lthn.io:3333","wallet":"iz..."}
```
Template parse errors and execution errors are both returned as wrapped errors with context (e.g., `store.Render: parse: ...` and `store.Render: exec: ...`).
Missing template variables do not return an error by default — Go's `text/template` renders them as `<no value>`. Applications requiring strict variable presence should set `Option("missingkey=error")` on their own template instances before calling `Render`, or validate data beforehand.
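A stdlib-only sketch of that strict-variables option (the `render` helper is hypothetical; only `Option("missingkey=error")` comes from `text/template` itself):

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

func render(tmpl string, data map[string]string, strict bool) (string, error) {
	t := template.New("demo")
	if strict {
		// missingkey=error turns an absent map key into an execution
		// error instead of rendering "<no value>".
		t = t.Option("missingkey=error")
	}
	t, err := t.Parse(tmpl)
	if err != nil {
		return "", err
	}
	var b strings.Builder
	if err := t.Execute(&b, data); err != nil {
		return "", err
	}
	return b.String(), nil
}

func main() {
	data := map[string]string{"pool": "pool.lthn.io:3333"}
	out, _ := render(`{{ .pool }} / {{ .wallet }}`, data, false)
	fmt.Println(out) // pool.lthn.io:3333 / <no value>
	_, err := render(`{{ .wallet }}`, data, true)
	fmt.Println(err != nil) // true: missing key is now an error
}
```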
## Watch/Unwatch Pattern
`Watch(group, key string) *Watcher` registers a subscription that receives events as they occur. Each `Watcher` holds a buffered channel (`Ch <-chan Event`) with capacity 16.
**Filter semantics:**
| group argument | key argument | Receives |
|---|---|---|
| `"mygroup"` | `"mykey"` | Only mutations to that exact key |
| `"mygroup"` | `"*"` | All mutations within the group, including `DeleteGroup` |
| `"*"` | `"*"` | Every mutation across the entire store |
`Unwatch(w *Watcher)` removes the watcher from the registry and closes its channel. It is safe to call multiple times — subsequent calls are no-ops.
**Backpressure.** Event dispatch to a watcher channel is non-blocking: if the channel buffer is full, the event is dropped silently. This prevents a slow consumer from blocking a writer. Applications that cannot afford dropped events should drain the channel promptly or use `OnChange` callbacks instead.
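The non-blocking dispatch reduces to a `select` with a `default` branch; `trySend` here is an illustrative stand-in for the internal watcher delivery, not the package's actual function:

```go
package main

import "fmt"

// trySend delivers e if the watcher's buffer has room, otherwise drops
// it rather than blocking the writing goroutine.
func trySend(ch chan string, e string) bool {
	select {
	case ch <- e:
		return true
	default:
		return false // buffer full: event dropped
	}
}

func main() {
	ch := make(chan string, 2) // real watcher buffer capacity is 16
	fmt.Println(trySend(ch, "a")) // true
	fmt.Println(trySend(ch, "b")) // true
	fmt.Println(trySend(ch, "c")) // false: dropped, writer never blocks
}
```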
**Idiomatic usage:**
```go
w := st.Watch("config", "*")
defer st.Unwatch(w)
for e := range w.Ch {
fmt.Println(e.Type, e.Group, e.Key, e.Value)
}
```
## OnChange Callbacks
`OnChange(fn func(Event)) func()` registers a synchronous callback that fires on every mutation. The callback runs in the goroutine that performed the write, holding the watcher/callback read-lock. Callers must not call store methods from within a callback (deadlock risk) and should offload any significant work to a goroutine.
`OnChange` returns an unregister function. Calling it removes the callback from the registry. The unregister function is idempotent.
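One way to build such an idempotent unregister function is a mutex-guarded map keyed by a monotonic ID; the `registry` type below is an illustrative sketch, not the package's real internals:

```go
package main

import (
	"fmt"
	"sync"
)

type registry struct {
	mu        sync.Mutex
	nextID    uint64
	callbacks map[uint64]func(string)
}

// OnChange registers fn and returns a closure that removes it.
func (r *registry) OnChange(fn func(string)) func() {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.callbacks == nil {
		r.callbacks = map[uint64]func(string){}
	}
	id := r.nextID
	r.nextID++
	r.callbacks[id] = fn
	return func() {
		r.mu.Lock()
		defer r.mu.Unlock()
		delete(r.callbacks, id) // deleting twice is a no-op: idempotent
	}
}

func (r *registry) notify(e string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	for _, fn := range r.callbacks {
		fn(e)
	}
}

func main() {
	var got []string
	r := &registry{}
	unreg := r.OnChange(func(e string) { got = append(got, e) })
	r.notify("set")
	unreg()
	unreg() // safe second call
	r.notify("delete")
	fmt.Println(got) // [set]
}
```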
This is the designed integration point for go-ws:
```go
unreg := st.OnChange(func(e store.Event) {
hub.SendToChannel("store-events", e)
})
defer unreg()
```
go-store does not import go-ws. The dependency flows in one direction only: go-ws (or any consumer) imports go-store.
## Event Model
Events are defined in `events.go`:
```go
type Event struct {
Type EventType
Group string
Key string
Value string
Timestamp time.Time
}
```
| EventType | String() | Key populated | Value populated |
|---|---|---|---|
| `EventSet` | `"set"` | Yes | Yes |
| `EventDelete` | `"delete"` | Yes | No |
| `EventDeleteGroup` | `"delete_group"` | No (empty) | No |
Events are emitted synchronously after each successful database write inside `notify()`. `notify()` acquires a read-lock on `s.mu`, iterates watchers with non-blocking channel sends, then calls each registered callback. The read-lock allows multiple concurrent `notify()` calls to proceed simultaneously, while `Watch`/`Unwatch`/`OnChange` take a write-lock when modifying the registry.
## Namespace Isolation (ScopedStore)
`ScopedStore` wraps a `*Store` and automatically prefixes all group names with `namespace + ":"`. This prevents key collisions when multiple tenants share a single underlying database.
```go
sc, _ := store.NewScoped(st, "tenant-42")
sc.Set("config", "theme", "dark")
// Stored in underlying store as group="tenant-42:config", key="theme"
```
Namespace strings must match `^[a-zA-Z0-9-]+$`. Invalid namespaces are rejected at construction time.
`ScopedStore` delegates all operations to the underlying `Store` after prefixing. Events emitted by scoped operations carry the full prefixed group name in `Event.Group`, enabling watchers on the underlying store to observe scoped mutations.
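The validation and prefixing rules can be sketched as follows, assuming the constructor uses a check equivalent to the pattern above (`validateNamespace` and `prefix` are illustrative names):

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

var nsPattern = regexp.MustCompile(`^[a-zA-Z0-9-]+$`)

// validateNamespace mirrors the constructor check described above:
// non-empty, alphanumeric plus hyphens only.
func validateNamespace(ns string) error {
	if !nsPattern.MatchString(ns) {
		return errors.New("store.NewScoped: invalid namespace")
	}
	return nil
}

// prefix is the group-name transformation every scoped call applies.
func prefix(namespace, group string) string {
	return namespace + ":" + group
}

func main() {
	fmt.Println(validateNamespace("tenant-42") == nil) // true
	fmt.Println(validateNamespace("") == nil)          // false: empty
	fmt.Println(validateNamespace("a:b") == nil)       // false: colon reserved
	fmt.Println(prefix("tenant-42", "config"))         // tenant-42:config
}
```

Rejecting `:` in namespaces is what keeps the `namespace:group` encoding unambiguous.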
### Quota Enforcement
`NewScopedWithQuota(store, namespace, QuotaConfig)` adds per-namespace limits:
```go
type QuotaConfig struct {
MaxKeys int // maximum total keys across all groups in the namespace
MaxGroups int // maximum distinct groups in the namespace
}
```
Zero values mean unlimited. Before each `Set` or `SetWithTTL`, the scoped store:
1. Checks whether the key already exists (upserts never consume quota).
2. If the key is new, queries `CountAll(namespace + ":")` and compares against `MaxKeys`.
3. If the group is new (current count for that group is zero), queries `Groups(namespace + ":")` and compares against `MaxGroups`.
Exceeding a limit returns `ErrQuotaExceeded`.
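The decision logic of those three steps can be sketched in isolation, with the existence checks and counts passed in rather than queried (only `QuotaConfig` and `ErrQuotaExceeded` appear in the source; `checkQuota` and its signature are assumptions for illustration):

```go
package main

import (
	"errors"
	"fmt"
)

var ErrQuotaExceeded = errors.New("store: quota exceeded")

type QuotaConfig struct {
	MaxKeys   int // zero means unlimited
	MaxGroups int // zero means unlimited
}

// checkQuota decides whether a Set may proceed. keyExists/groupExists
// and the counts would come from the underlying store's CountAll and
// Groups queries.
func checkQuota(q QuotaConfig, keyExists, groupExists bool, totalKeys, totalGroups int) error {
	if keyExists {
		return nil // upsert of an existing key never consumes quota
	}
	if q.MaxKeys > 0 && totalKeys >= q.MaxKeys {
		return ErrQuotaExceeded
	}
	if !groupExists && q.MaxGroups > 0 && totalGroups >= q.MaxGroups {
		return ErrQuotaExceeded
	}
	return nil
}

func main() {
	q := QuotaConfig{MaxKeys: 5, MaxGroups: 3}
	fmt.Println(checkQuota(q, false, true, 4, 3))  // <nil>: room for a 5th key
	fmt.Println(checkQuota(q, false, true, 5, 3))  // store: quota exceeded
	fmt.Println(checkQuota(q, true, true, 5, 3))   // <nil>: upsert is free
	fmt.Println(checkQuota(q, false, false, 1, 3)) // store: quota exceeded (4th group)
}
```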
## Concurrency Model
All SQLite access is serialised through a single connection (`SetMaxOpenConns(1)`). The store's watcher/callback registry is protected by a separate `sync.RWMutex` (`s.mu`). These two locks do not interact:
- DB writes acquire no application-level lock.
- `notify()` acquires `s.mu` (read) after the DB write completes.
- `Watch`/`Unwatch`/`OnChange` acquire `s.mu` (write) to modify the registry.
All operations are safe to call from multiple goroutines concurrently. The race detector is clean under the project's standard test suite (`go test -race ./...`).
## File Layout
```
store.go Core Store type, CRUD operations, TTL, background purge
events.go EventType, Event, Watcher, OnChange, notify
scope.go ScopedStore, QuotaConfig
store_test.go Tests: CRUD, TTL, concurrency, edge cases, benchmarks
events_test.go Tests: Watch, Unwatch, OnChange, event dispatch
scope_test.go Tests: namespace isolation, quota enforcement
coverage_test.go Tests: error paths for defensive code (scan errors, corruption)
bench_test.go Additional benchmarks
```

docs/development.md Normal file

@@ -0,0 +1,157 @@
# Development Guide — go-store
## Prerequisites
- Go 1.25 or later
- No CGO required (`modernc.org/sqlite` is a pure-Go SQLite implementation)
- No external tools beyond the Go toolchain
## Build and Test
The package is a standard Go module. All standard `go` commands apply.
```bash
# Run all tests
go test ./...
# Run with the race detector (required before any commit touching concurrency)
go test -race ./...
# Run a single test by name
go test -v -run TestWatch_Good_SpecificKey ./...
# Run tests with coverage
go test -cover ./...
# Generate a coverage profile and view it in the browser
go test -coverprofile=coverage.out ./...
go tool cover -html=coverage.out
# Run benchmarks
go test -bench=. -benchmem ./...
# Run a specific benchmark
go test -bench=BenchmarkSet -benchmem ./...
```
**Coverage target: 95%.** The remaining uncovered lines are defensive error paths (scan errors, rows iteration errors on corrupted databases); `coverage_test.go` exercises those that can be simulated. Do not remove these checks to chase coverage — they protect against driver and OS-level failures that are not reachable through integration tests against a healthy SQLite database.
## Test Patterns
Tests follow the `_Good`, `_Bad`, `_Ugly` suffix convention used across the Core Go ecosystem:
- `_Good` — happy-path behaviour, including edge cases that should succeed
- `_Bad` — expected error conditions (closed store, invalid input, quota exceeded)
- `_Ugly` — not currently used in this package (reserved for panic/edge cases)
Tests are grouped into sections by the method under test, marked with comment banners:
```go
// ---------------------------------------------------------------------------
// Watch — specific key
// ---------------------------------------------------------------------------
func TestWatch_Good_SpecificKey(t *testing.T) { ... }
func TestWatch_Good_WildcardKey(t *testing.T) { ... }
```
**In-memory stores for unit tests.** Use `New(":memory:")` for all tests that do not require persistence. In-memory stores are faster and leave no filesystem artefacts.
**File-backed stores for concurrency and persistence tests.** Use `filepath.Join(t.TempDir(), "name.db")` for tests that verify WAL mode, persistence across open/close cycles, or concurrent writes. `t.TempDir()` is cleaned up automatically at the end of the test.
**Avoid `time.Sleep` except for TTL tests.** TTL expiry tests require short sleeps. Use the minimum duration that reliably demonstrates expiry (typically 1ms TTL + 5ms sleep). Do not use sleeps as synchronisation barriers for goroutines — use `sync.WaitGroup` instead.
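For example, a concurrent-writes test can synchronise on a `WaitGroup` rather than a sleep (the counter below stands in for store calls; this is a pattern sketch, not a test from the suite):

```go
package main

import (
	"fmt"
	"sync"
)

// run launches 10 writer goroutines and waits for all of them with a
// WaitGroup. The barrier is deterministic; no sleep duration to tune.
func run() int {
	var (
		mu    sync.Mutex
		wg    sync.WaitGroup
		total int
	)
	for g := 0; g < 10; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < 100; i++ {
				mu.Lock()
				total++ // stands in for a store.Set call
				mu.Unlock()
			}
		}()
	}
	wg.Wait() // all writers finished here, guaranteed
	return total
}

func main() {
	fmt.Println(run()) // 1000
}
```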
**Race detector is mandatory for all concurrency tests.** Run `go test -race ./...` before marking any concurrency work as complete. The test suite must be clean under the race detector.
**Testify assertions.** The test suite uses `github.com/stretchr/testify/assert` and `github.com/stretchr/testify/require`. Use `require` for preconditions (test should abort on failure) and `assert` for verifications (test should continue and report all failures).
## Coding Standards
### Language
Use UK English throughout all documentation, comments, and error messages. This applies to spelling (colour, organisation, behaviour, serialise, initialise) and terminology. American spellings are not acceptable.
### Code Style
- `gofmt` formatting is mandatory. Run `go fmt ./...` before committing.
- `go vet ./...` must report no warnings.
- All error strings begin with the package function context: `"store.Method: what failed"`. This convention makes errors self-identifying in log output without requiring a stack trace.
- Exported identifiers must have Go doc comments.
- Internal helpers (unexported) should have comments explaining non-obvious behaviour.
### Licence Header
All source files must include the EUPL-1.2 licence identifier. The licence is specified in the module metadata and applies to all contributions.
### Dependencies
go-store is intentionally minimal. Before adding any new dependency:
1. Verify it cannot be replaced with a standard library alternative.
2. Verify it is pure Go (no CGO) to preserve cross-compilation.
3. Verify it has a compatible open-source licence (EUPL-1.2 compatible).
The only permitted runtime dependency is `modernc.org/sqlite`. Test-only dependencies (`github.com/stretchr/testify`) are acceptable.
## Commit Guidelines
Use conventional commit format:
```
type(scope): description
```
Common types: `feat`, `fix`, `test`, `docs`, `refactor`, `perf`, `chore`.
Examples:
```
feat(store): add PurgeExpired public method
fix(events): prevent deadlock when callback calls store methods
test(scope): add quota enforcement for new groups
docs(architecture): document WAL single-connection constraint
perf(store): replace linear watcher scan with index lookup
```
Every commit must include the co-author trailer:
```
Co-Authored-By: Virgil <virgil@lethean.io>
```
All tests must pass before committing:
```bash
go test -race ./...
go vet ./...
```
## Benchmarks and Performance
Reference benchmark results (Apple M-series, in-memory store):
```
BenchmarkSet-32 119280 10290 ns/op 328 B/op 12 allocs/op
BenchmarkGet-32 335707 3589 ns/op 576 B/op 21 allocs/op
BenchmarkGetAll-32 (10K keys) 258 4741451 ns/op 2268787 B/op 80095 allocs/op
BenchmarkSet_FileBacked-32 4525 265868 ns/op 327 B/op 12 allocs/op
```
Derived throughput:
- In-memory `Set`: approximately 97,000 ops/sec
- In-memory `Get`: approximately 279,000 ops/sec
- File-backed `Set`: approximately 3,800 ops/sec (dominated by fsync)
- `GetAll` with 10,000 keys: approximately 2.3 MB allocated per call
`GetAll` allocations scale linearly with the number of keys (one map entry per row). Applications fetching very large groups should consider pagination at a higher layer or restructuring data into multiple smaller groups.
## Adding a New Method
1. Implement the method on `*Store` in `store.go` (or `scope.go` if it is namespace-scoped).
2. If it is a mutating operation, call `s.notify(Event{...})` after the successful database write.
3. Add a corresponding delegation method to `ScopedStore` in `scope.go` that prefixes the group.
4. Write tests covering the happy path, error conditions, and closed-store behaviour.
5. Update quota checks in `checkQuota` if the operation affects key or group counts.
6. Run `go test -race ./...` and `go vet ./...`.
7. Update `docs/architecture.md` if the method introduces a new concept or changes an existing one.

docs/history.md Normal file

@@ -0,0 +1,178 @@
# Project History — go-store
## Origin
Extracted from `forge.lthn.ai/core/go` (`pkg/store/`) on 19 February 2026 by Virgil. The extraction gave the package its own module path (`forge.lthn.ai/core/go-store`), its own repository, and independent versioning.
At extraction the package comprised a single source file and a single test file. It provided basic CRUD with group namespacing and template rendering but had no TTL, no namespace isolation, and no event system. Test coverage was 73.1%.
---
## Phase 0 — Hardening and Test Coverage
**Agent:** Charon
**Completed:** 2026-02-20
### Concurrency fix: SQLITE_BUSY under contention
**Problem.** The `database/sql` connection pool hands out different physical connections for each `Exec` or `Query` call. SQLite pragmas (`PRAGMA journal_mode=WAL`, `PRAGMA busy_timeout`) are per-connection. Under concurrent write load (10 goroutines, 100 ops each), connections from the pool that had never received the pragmas would return `SQLITE_BUSY` immediately rather than waiting and retrying.
**Fix.** `db.SetMaxOpenConns(1)` serialises all database access through a single connection. Because SQLite is a single-writer database by design (it serialises writes at the file-lock level regardless of pool size), this does not reduce write throughput. It eliminates the BUSY errors by ensuring the pragma settings always apply.
**Defence in depth.** `PRAGMA busy_timeout=5000` was added to make the single connection wait up to 5 seconds before reporting a timeout error, providing additional resilience.
### Extended test coverage
Added tests for:
- Concurrent read/write with 10 goroutines and the race detector
- `Render()` with invalid template syntax (parse error)
- `Render()` with template execution error (calling a string as a function)
- `Get()` on a non-existent group vs. a non-existent key
- `DeleteGroup()` followed by `GetAll()` returning empty
- `Count()` after 500 bulk inserts
- In-memory vs. file-backed store (persistence across open/close)
- WAL mode verification via `PRAGMA journal_mode` query
- Edge cases: empty key, empty value, empty group, 10K-character key, binary-like values with null bytes, Unicode (accented, CJK, Arabic), SQL injection attempts, special characters
Coverage: 73.1% to 90.9%.
The remaining 9.1% comprised defensive error paths in `New()`, `GetAll()`, and `Render()` that are unreachable through integration tests against a healthy SQLite database (driver initialisation failures, scan errors on NULL columns, rows iteration errors on corrupted pages).
### Benchmarks
Established baseline benchmark results:
```
BenchmarkSet-32 119280 10290 ns/op
BenchmarkGet-32 335707 3589 ns/op
BenchmarkGetAll-32 (10K keys) 258 4741451 ns/op
BenchmarkSet_FileBacked-32 4525 265868 ns/op
```
`go vet ./...` runs clean with no warnings.
---
## Phase 1 — TTL Support
**Agent:** Charon
**Completed:** 2026-02-20
Added optional time-to-live for keys.
### Changes
- `expires_at INTEGER` nullable column added to the `kv` schema.
- `SetWithTTL(group, key, value string, ttl time.Duration)` stores the current time plus TTL as a Unix millisecond timestamp in `expires_at`.
- `Get()` performs lazy deletion: if a key is found with an `expires_at` in the past, it is deleted and `ErrNotFound` is returned.
- `Count()`, `GetAll()`, and `Render()` include `(expires_at IS NULL OR expires_at > ?)` in all queries, excluding expired keys from results.
- `PurgeExpired()` public method deletes all physically stored expired rows and returns the count removed.
- Background goroutine calls `PurgeExpired()` every 60 seconds, controlled by a `context.WithCancel` that is cancelled on `Close()`.
- `Set()` clears any existing TTL when overwriting a key (sets `expires_at = NULL`).
- Schema migration: `ALTER TABLE kv ADD COLUMN expires_at INTEGER` runs on `New()`. The "duplicate column" error on already-upgraded databases is silently ignored.
### Tests added
TTL functionality tests covering: normal expiry on `Get`, exclusion from `Count`/`GetAll`/`Render`, `SetWithTTL` upsert, plain `Set` clearing TTL, future TTL remaining accessible, background purge goroutine, concurrent TTL access with 10 goroutines, schema migration from a pre-TTL database (manually constructed without `expires_at`).
Coverage: 90.9% to 94.7%.
---
## Phase 2 — Namespace Isolation
**Agent:** Charon
**Completed:** 2026-02-20
Added `ScopedStore` for multi-tenant namespace isolation.
### Changes
- `scope.go` introduced with `ScopedStore` wrapping `*Store`.
- Namespace strings validated against `^[a-zA-Z0-9-]+$`.
- `NewScoped(store, namespace)` constructor.
- All `Store` methods delegated with group automatically prefixed as `namespace + ":" + group`.
- `QuotaConfig{MaxKeys, MaxGroups int}` struct; zero means unlimited.
- `NewScopedWithQuota(store, namespace, quota)` constructor.
- `ErrQuotaExceeded` sentinel error.
- `checkQuota(group, key)` internal method: skips upserts (existing key), checks `CountAll(namespace+":")` against `MaxKeys`, checks `Groups(namespace+":")` against `MaxGroups` only when the group is new.
- `CountAll(prefix string)` added to `Store`: counts non-expired keys across all groups matching a prefix. Empty prefix counts across all groups.
- `Groups(prefix string)` added to `Store`: returns distinct non-expired group names matching a prefix. Empty prefix returns all groups.
### Tests added
ScopedStore basic CRUD, cross-namespace isolation, `GetAll` scoping. Quota tests: MaxKeys limit, upsert does not count, delete and re-insert stays within quota, MaxGroups limit, zero quota is unlimited, TTL-expired keys do not count towards quota. `CountAll` and `Groups` tests with mixed namespaces and expired key exclusion.
Coverage: 94.7% to 95.5% (approximate; coverage_test.go added to cover defensive paths).
---
## Phase 3 — Event Hooks
**Agent:** Charon
**Completed:** 2026-02-20
Added a reactive notification system for store mutations.
### Changes
- `events.go` introduced with `EventType` (`EventSet`, `EventDelete`, `EventDeleteGroup`), `Event` struct, `Watcher` struct, `callbackEntry` struct.
- `watcherBufSize = 16` constant.
- `Watch(group, key string) *Watcher`: creates a buffered channel watcher. Wildcard `"*"` supported for both group and key. Uses `atomic.AddUint64` for monotonic watcher IDs.
- `Unwatch(w *Watcher)`: removes watcher from the registry and closes its channel. Idempotent.
- `OnChange(fn func(Event)) func()`: registers a synchronous callback. Returns an idempotent unregister function using `sync.Once`.
- `notify(e Event)`: internal dispatch. Acquires read-lock on `s.mu`; non-blocking send to each matching watcher channel (drop-on-full); calls each callback synchronously. Separate `watcherMatches` helper handles wildcard logic.
- `Set()`, `SetWithTTL()`, `Delete()`, `DeleteGroup()` each call `notify()` after the successful database write.
- `Store` struct extended with `watchers []*Watcher`, `callbacks []callbackEntry`, `mu sync.RWMutex`, `nextID uint64`.
- ScopedStore mutations automatically emit events with the full prefixed group name — no extra implementation required.
### Tests added
Specific-key watcher receives matching events and ignores non-matching keys. Wildcard-key watcher receives all keys in a group. Global wildcard `("*", "*")` receives all mutations across all groups. `Unwatch` stops delivery and closes the channel. `Unwatch` is idempotent. Delete and DeleteGroup emit correct event types with correct populated fields. `OnChange` callback fires on Set and Delete. `OnChange` unregister stops future invocations (idempotent). Buffer-full (32 writes against cap-16 channel) does not block the writer. Multiple watchers on the same key receive events independently. Concurrent Watch/Unwatch during concurrent writes (race test, 10 goroutines). `ScopedStore` events carry the prefixed group name. `SetWithTTL` emits `EventSet`. `EventType.String()` returns correct labels including `"unknown"` for undefined values.
Coverage: 94.7% to 95.5%.
---
## Coverage Test Suite
`coverage_test.go` exercises defensive error paths that integration tests cannot reach through normal usage:
- Schema conflict: pre-existing SQLite index named `kv` causes `New()` to return `store.New: schema: ...`.
- `GetAll` scan error: NULL key in a row (requires manually altering the schema to remove the NOT NULL constraint).
- `GetAll` rows iteration error: physically corrupting database pages mid-file to trigger `rows.Err()` during multi-page scans.
- `Render` scan error: same NULL-key technique.
- `Render` rows iteration error: same corruption technique.
These tests exercise correct defensive code: they must continue to pass, but the failures they simulate are not representative of real production failure modes.
---
## Known Limitations
**Single writer.** `SetMaxOpenConns(1)` serialises all access through one connection. Write throughput is bounded by SQLite's single-writer architecture. This is appropriate for the intended use cases (configuration storage, session state, per-tenant key-value data) but is not suitable for high-throughput append-only workloads.
**File-backed write throughput.** File-backed `Set` operations (~3,800 ops/sec on Apple M-series) are dominated by fsync. Applications writing at higher rates should use in-memory stores or consider WAL checkpoint tuning.
**`GetAll` memory usage.** Fetching a group with 10,000 keys allocates approximately 2.3 MB per call. There is no pagination API. Applications with very large groups should restructure data into smaller groups or query selectively.
**No cross-group transactions.** There is no API for atomic multi-group operations. Each method is individually atomic at the SQLite level, but there is no `Begin`/`Commit` exposed to callers.
**No wildcard deletes.** There is no `DeletePrefix` or pattern-based delete. To delete all groups under a namespace, callers must retrieve the group list via `Groups()` and delete each individually.
**Callback deadlock risk.** `OnChange` callbacks run synchronously in the writer's goroutine while `notify()` holds `s.mu` for reading. Calling `Watch`, `Unwatch`, or `OnChange` from within a callback deadlocks immediately, because those methods need the write lock. Calling a `Store` mutation from within a callback is also unsafe: the nested `notify()` re-acquires the read lock, and Go's `sync.RWMutex` does not support recursive read locking, so the inner `RLock` can deadlock whenever a writer is already waiting. This constraint must be documented prominently wherever callbacks are used.
**No persistence of watcher registrations.** Watchers and callbacks are in-memory only. They are not persisted across `Close`/`New` cycles.
---
## Future Considerations
These are design notes, not committed work:
- **Pagination for `GetAll`.** A `GetPage(group string, offset, limit int)` method would support large groups without full in-memory materialisation.
- **Indexed prefix keys.** An additional index on `(grp, key)` prefix would accelerate prefix scans without a full-table scan.
- **TTL background purge interval as constructor option.** Currently only settable by mutating `s.purgeInterval` directly in tests. A `WithPurgeInterval(d time.Duration)` functional option would make this part of the public API.
- **Cross-group atomic operations.** Exposing a `Transaction(func(tx *StoreTx) error)` API would allow callers to compose atomic multi-group operations.
- **`DeletePrefix(prefix string)` method.** Would enable efficient cleanup of an entire namespace without first listing groups.
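As a rough illustration of the first note, a hypothetical `GetPage` could be a single query. Nothing here is implemented; the column names are taken from the schema and index notes above.

```
-- Hypothetical: one page of non-expired keys from a group (design note only).
SELECT key, value
FROM kv
WHERE grp = ? AND (expires_at IS NULL OR expires_at > ?)
ORDER BY key
LIMIT ? OFFSET ?;
```

The `ORDER BY key` is what makes pages stable between calls; a keyset variant (`WHERE key > ?` in place of `OFFSET`) would avoid the cost of skipping rows in very large groups.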