Project History — go-store
Origin
Extracted from forge.lthn.ai/core/go (pkg/store/) on 19 February 2026 by Virgil. The extraction gave the package its own module path (forge.lthn.ai/core/go-store), its own repository, and independent versioning.
At extraction the package comprised a single source file and a single test file. It provided basic CRUD with group namespacing and template rendering but had no TTL, no namespace isolation, and no event system. Test coverage was 73.1%.
Phase 0 — Hardening and Test Coverage
Agent: Charon Completed: 2026-02-20
Concurrency fix: SQLITE_BUSY under contention
Problem. The database/sql connection pool may hand out a different physical connection for each Exec or Query call. SQLite pragmas (PRAGMA journal_mode=WAL, PRAGMA busy_timeout) are per-connection. Under concurrent write load (10 goroutines, 100 ops each), pooled connections that had never received the pragmas failed with SQLITE_BUSY immediately instead of waiting for the lock.
Fix. db.SetMaxOpenConns(1) serialises all database access through a single connection. Because SQLite is a single-writer database by design (it serialises writes at the file-lock level regardless of pool size), this does not reduce write throughput. It eliminates the BUSY errors by ensuring the pragma settings always apply.
Defence in depth. PRAGMA busy_timeout=5000 was added to make the single connection wait up to 5 seconds before reporting a timeout error, providing additional resilience.
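The pooling behaviour behind this fix can be observed without SQLite. The sketch below is not the package's code: it uses a hypothetical counting driver (countingConnector, fakeConn, and fakeStmt are invented for illustration) to report how many physical connections database/sql opens for 10 concurrent Exec calls, with and without SetMaxOpenConns(1). Any per-connection setting, such as a pragma, would only exist on the connection that ran it.

```go
package main

import (
	"context"
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// countingConnector is a hypothetical stand-in for a real SQLite driver.
// It only counts how many physical connections database/sql opens.
type countingConnector struct{ opened int64 }

func (c *countingConnector) Connect(context.Context) (driver.Conn, error) {
	atomic.AddInt64(&c.opened, 1)
	return fakeConn{}, nil
}

func (c *countingConnector) Driver() driver.Driver { return fakeDriver{} }

type fakeDriver struct{}

func (fakeDriver) Open(string) (driver.Conn, error) {
	return nil, errors.New("use sql.OpenDB with the connector")
}

type fakeConn struct{}

func (fakeConn) Prepare(string) (driver.Stmt, error) { return fakeStmt{}, nil }
func (fakeConn) Close() error                        { return nil }
func (fakeConn) Begin() (driver.Tx, error)           { return nil, errors.New("unsupported") }

type fakeStmt struct{}

func (fakeStmt) Close() error  { return nil }
func (fakeStmt) NumInput() int { return -1 }
func (fakeStmt) Exec([]driver.Value) (driver.Result, error) {
	time.Sleep(time.Millisecond) // simulate a slow write so calls overlap
	return driver.ResultNoRows, nil
}
func (fakeStmt) Query([]driver.Value) (driver.Rows, error) {
	return nil, errors.New("unsupported")
}

// connectionsOpened runs 10 concurrent Exec calls and reports how many
// physical connections the pool created. maxOpen <= 0 means unlimited.
func connectionsOpened(maxOpen int) int64 {
	c := &countingConnector{}
	db := sql.OpenDB(c)
	defer db.Close()
	db.SetMaxOpenConns(maxOpen)

	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_, _ = db.Exec("INSERT")
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&c.opened)
}

func main() {
	fmt.Println("unlimited pool:", connectionsOpened(0))
	fmt.Println("SetMaxOpenConns(1):", connectionsOpened(1))
}
```

With the limit set to 1 the pool opens exactly one connection, so every statement runs on the connection that received the pragmas. The unlimited count varies with scheduling and is typically greater than one.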
Extended test coverage
Added tests for:
- Concurrent read/write with 10 goroutines and the race detector
- Render() with invalid template syntax (parse error)
- Render() with template execution error (calling a string as a function)
- Get() on a non-existent group vs. a non-existent key
- DeleteGroup() followed by GetAll() returning empty
- Count() after 500 bulk inserts
- In-memory vs. file-backed store (persistence across open/close)
- WAL mode verification via PRAGMA journal_mode query
- Edge cases: empty key, empty value, empty group, 10K-character key, binary-like values with null bytes, Unicode (accented, CJK, Arabic), SQL injection attempts, special characters
Coverage: 73.1% to 90.9%.
The remaining 9.1% comprised defensive error paths in New(), GetAll(), and Render() that are unreachable through integration tests against a healthy SQLite database (driver initialisation failures, scan errors on NULL columns, rows iteration errors on corrupted pages).
Benchmarks
Established baseline benchmark results:
BenchmarkSet-32 119280 10290 ns/op
BenchmarkGet-32 335707 3589 ns/op
BenchmarkGetAll-32 (10K keys) 258 4741451 ns/op
BenchmarkSet_FileBacked-32 4525 265868 ns/op
go vet ./... now runs clean, with no warnings.
Phase 1 — TTL Support
Agent: Charon Completed: 2026-02-20
Added optional time-to-live for keys.
Changes
- expires_at INTEGER nullable column added to the kv schema.
- SetWithTTL(group, key, value string, ttl time.Duration) stores the current time plus TTL as a Unix millisecond timestamp in expires_at.
- Get() performs lazy deletion: if a key is found with an expires_at in the past, it is deleted and ErrNotFound is returned.
- Count(), GetAll(), and Render() include (expires_at IS NULL OR expires_at > ?) in all queries, excluding expired keys from results.
- PurgeExpired() public method deletes all physically stored expired rows and returns the count removed.
- Background goroutine calls PurgeExpired() every 60 seconds, controlled by a context.WithCancel that is cancelled on Close().
- Set() clears any existing TTL when overwriting a key (sets expires_at = NULL).
- Schema migration: ALTER TABLE kv ADD COLUMN expires_at INTEGER runs on New(). The "duplicate column" error on already-upgraded databases is silently ignored.
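The TTL rules listed above can be modelled in memory. This is a minimal sketch, not the SQLite implementation: memStore, entry, the injectable clock, and demoTTL are hypothetical, and expiresAt == 0 stands in for a NULL expires_at. It shows lazy deletion in Get, Set clearing an existing TTL, and PurgeExpired returning the number of rows removed.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var ErrNotFound = errors.New("not found")

// entry mirrors one kv row; expiresAt == 0 stands in for a NULL expires_at.
type entry struct {
	value     string
	expiresAt int64 // Unix milliseconds
}

// memStore is a hypothetical in-memory model. It is not goroutine-safe.
type memStore struct {
	rows map[string]entry
	now  func() time.Time // injectable clock (an assumption, for testability)
}

func newMemStore(now func() time.Time) *memStore {
	return &memStore{rows: map[string]entry{}, now: now}
}

func id(group, key string) string { return group + "\x00" + key }

// Set overwrites the row and clears any TTL it had.
func (s *memStore) Set(group, key, value string) {
	s.rows[id(group, key)] = entry{value: value}
}

// SetWithTTL stores now+ttl as a Unix millisecond timestamp.
func (s *memStore) SetWithTTL(group, key, value string, ttl time.Duration) {
	s.rows[id(group, key)] = entry{value: value, expiresAt: s.now().Add(ttl).UnixMilli()}
}

// expired is the inverse of (expires_at IS NULL OR expires_at > now).
func (s *memStore) expired(e entry) bool {
	return e.expiresAt != 0 && e.expiresAt <= s.now().UnixMilli()
}

// Get deletes an expired row on access and reports ErrNotFound.
func (s *memStore) Get(group, key string) (string, error) {
	k := id(group, key)
	e, ok := s.rows[k]
	if !ok {
		return "", ErrNotFound
	}
	if s.expired(e) {
		delete(s.rows, k)
		return "", ErrNotFound
	}
	return e.value, nil
}

// PurgeExpired removes every expired row and returns how many it removed.
func (s *memStore) PurgeExpired() int {
	n := 0
	for k, e := range s.rows {
		if s.expired(e) {
			delete(s.rows, k)
			n++
		}
	}
	return n
}

func demoTTL() (staleErr error, fresh string, purged int) {
	clock := time.Unix(1000, 0)
	s := newMemStore(func() time.Time { return clock })
	s.SetWithTTL("session", "stale", "old", time.Second)
	s.SetWithTTL("session", "other", "old", time.Second)
	s.SetWithTTL("session", "token", "abc", time.Minute)
	s.Set("session", "token", "xyz") // plain Set clears the TTL
	clock = clock.Add(2 * time.Minute)
	_, staleErr = s.Get("session", "stale") // lazily deleted here
	fresh, _ = s.Get("session", "token")
	purged = s.PurgeExpired() // removes the untouched "other" row
	return
}

func main() {
	fmt.Println(demoTTL())
}
```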
Tests added
TTL functionality tests covering: normal expiry on Get, exclusion from Count/GetAll/Render, SetWithTTL upsert, plain Set clearing TTL, future TTL remaining accessible, background purge goroutine, concurrent TTL access with 10 goroutines, schema migration from a pre-TTL database (manually constructed without expires_at).
Coverage: 90.9% to 94.7%.
Phase 2 — Namespace Isolation
Agent: Charon Completed: 2026-02-20
Added ScopedStore for multi-tenant namespace isolation.
Changes
- scope.go introduced with ScopedStore wrapping *Store.
- Namespace strings validated against ^[a-zA-Z0-9-]+$.
- NewScoped(store, namespace) constructor.
- All Store methods delegated with group automatically prefixed as namespace + ":" + group.
- QuotaConfig{MaxKeys, MaxGroups int} struct; zero means unlimited.
- NewScopedWithQuota(store, namespace, quota) constructor.
- ErrQuotaExceeded sentinel error.
- checkQuota(group, key) internal method: skips upserts (existing key), checks CountAll(namespace+":") against MaxKeys, checks Groups(namespace+":") against MaxGroups only when the group is new.
- CountAll(prefix string) added to Store: counts non-expired keys across all groups matching a prefix. Empty prefix counts across all groups.
- Groups(prefix string) added to Store: returns distinct non-expired group names matching a prefix. Empty prefix returns all groups.
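The prefixing and quota order described above can be sketched against a plain map. This is an illustrative model rather than the contents of scope.go: the scoped type, newScoped, and demoScoped are hypothetical, while the regular expression, QuotaConfig fields, and ErrQuotaExceeded follow the names in the list.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
)

var (
	ErrQuotaExceeded = errors.New("quota exceeded")
	validNamespace   = regexp.MustCompile(`^[a-zA-Z0-9-]+$`)
)

// QuotaConfig mirrors the struct described above; zero means unlimited.
type QuotaConfig struct{ MaxKeys, MaxGroups int }

// scoped is a hypothetical in-memory model of a namespaced view.
// data is shared between namespaces and keyed by the prefixed group name.
type scoped struct {
	data      map[string]map[string]string
	namespace string
	quota     QuotaConfig
}

func newScoped(data map[string]map[string]string, ns string, q QuotaConfig) (*scoped, error) {
	if !validNamespace.MatchString(ns) {
		return nil, fmt.Errorf("invalid namespace %q", ns)
	}
	return &scoped{data: data, namespace: ns, quota: q}, nil
}

func (s *scoped) prefixed(group string) string { return s.namespace + ":" + group }

// checkQuota skips upserts, then checks MaxKeys, then checks MaxGroups
// only when the group does not exist yet.
func (s *scoped) checkQuota(group, key string) error {
	g := s.prefixed(group)
	if _, exists := s.data[g][key]; exists {
		return nil // overwriting an existing key never consumes quota
	}
	keys, groups := 0, 0
	for name, kv := range s.data {
		if strings.HasPrefix(name, s.namespace+":") {
			keys += len(kv)
			groups++
		}
	}
	if s.quota.MaxKeys > 0 && keys >= s.quota.MaxKeys {
		return ErrQuotaExceeded
	}
	_, groupExists := s.data[g]
	if !groupExists && s.quota.MaxGroups > 0 && groups >= s.quota.MaxGroups {
		return ErrQuotaExceeded
	}
	return nil
}

func (s *scoped) Set(group, key, value string) error {
	if err := s.checkQuota(group, key); err != nil {
		return err
	}
	g := s.prefixed(group)
	if s.data[g] == nil {
		s.data[g] = map[string]string{}
	}
	s.data[g][key] = value
	return nil
}

func demoScoped() (a, b string, upsertErr, quotaErr, nsErr error) {
	data := map[string]map[string]string{}
	ta, _ := newScoped(data, "tenant-a", QuotaConfig{MaxKeys: 1})
	tb, _ := newScoped(data, "tenant-b", QuotaConfig{})
	_ = ta.Set("config", "theme", "dark")
	_ = tb.Set("config", "theme", "light")
	upsertErr = ta.Set("config", "theme", "blue") // existing key: allowed
	quotaErr = ta.Set("config", "lang", "en")     // second key: rejected
	_, nsErr = newScoped(data, "bad:name", QuotaConfig{})
	return data["tenant-a:config"]["theme"], data["tenant-b:config"]["theme"], upsertErr, quotaErr, nsErr
}

func main() {
	fmt.Println(demoScoped())
}
```

Because the colon is excluded from valid namespaces, the prefix "tenant-a:" can never match a group that belongs to a different namespace.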
Tests added
ScopedStore basic CRUD, cross-namespace isolation, GetAll scoping. Quota tests: MaxKeys limit, upsert does not count, delete and re-insert stays within quota, MaxGroups limit, zero quota is unlimited, TTL-expired keys do not count towards quota. CountAll and Groups tests with mixed namespaces and expired key exclusion.
Coverage: 94.7% to 95.5% (approximate; coverage_test.go added to cover defensive paths).
Phase 3 — Event Hooks
Agent: Charon Completed: 2026-02-20
Added a reactive notification system for store mutations.
Changes
- events.go introduced with EventType (EventSet, EventDelete, EventDeleteGroup), Event struct, Watcher struct, callbackEntry struct.
- watcherBufSize = 16 constant.
- Watch(group, key string) *Watcher: creates a buffered channel watcher. Wildcard "*" supported for both group and key. Uses atomic.AddUint64 for monotonic watcher IDs.
- Unwatch(w *Watcher): removes watcher from the registry and closes its channel. Idempotent.
- OnChange(fn func(Event)) func(): registers a synchronous callback. Returns an idempotent unregister function using sync.Once.
- notify(e Event): internal dispatch. Acquires read-lock on s.mu; non-blocking send to each matching watcher channel (drop-on-full); calls each callback synchronously. Separate watcherMatches helper handles wildcard logic.
- Set(), SetWithTTL(), Delete(), DeleteGroup() each call notify() after the successful database write.
- Store struct extended with watchers []*Watcher, callbacks []callbackEntry, mu sync.RWMutex, nextID uint64.
- ScopedStore mutations automatically emit events with the full prefixed group name; no extra implementation required.
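The drop-on-full dispatch and wildcard matching described above can be sketched as follows. This is a simplified model, not events.go: the hub type, the Event field names, the exported Ch channel field, and the dropped-count return value of notify are assumptions made so the behaviour is observable.

```go
package main

import (
	"fmt"
	"sync"
)

// Event field names here are assumptions for the sketch.
type Event struct{ Group, Key, Value string }

// Watcher holds a buffered channel plus the pattern it subscribed with.
type Watcher struct {
	Ch         chan Event
	group, key string
}

const watcherBufSize = 16

// hub is a hypothetical stand-in for the watcher registry on Store.
type hub struct {
	mu       sync.RWMutex
	watchers []*Watcher
}

func (h *hub) Watch(group, key string) *Watcher {
	w := &Watcher{Ch: make(chan Event, watcherBufSize), group: group, key: key}
	h.mu.Lock()
	h.watchers = append(h.watchers, w)
	h.mu.Unlock()
	return w
}

// matches treats "*" as a wildcard for either field.
func matches(w *Watcher, e Event) bool {
	return (w.group == "*" || w.group == e.Group) &&
		(w.key == "*" || w.key == e.Key)
}

// notify never blocks the writer: if a watcher's buffer is full the
// event is dropped for that watcher. It returns the number of drops.
func (h *hub) notify(e Event) (dropped int) {
	h.mu.RLock()
	defer h.mu.RUnlock()
	for _, w := range h.watchers {
		if !matches(w, e) {
			continue
		}
		select {
		case w.Ch <- e:
		default:
			dropped++
		}
	}
	return dropped
}

func demoWatch() (delivered, dropped, other int) {
	h := &hub{}
	all := h.Watch("config", "*")
	one := h.Watch("users", "alice")
	for i := 0; i < 32; i++ {
		dropped += h.notify(Event{Group: "config", Key: fmt.Sprintf("k%d", i)})
	}
	return len(all.Ch), dropped, len(one.Ch)
}

func main() {
	fmt.Println(demoWatch())
}
```

A consumer that must not miss events needs to drain its channel promptly; with nobody reading, only the first 16 of the 32 writes are retained.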
Tests added
Specific-key watcher receives matching events and ignores non-matching keys. Wildcard-key watcher receives all keys in a group. Global wildcard ("*", "*") receives all mutations across all groups. Unwatch stops delivery and closes the channel. Unwatch is idempotent. Delete and DeleteGroup emit correct event types with correct populated fields. OnChange callback fires on Set and Delete. OnChange unregister stops future invocations (idempotent). Buffer-full (32 writes against cap-16 channel) does not block the writer. Multiple watchers on the same key receive events independently. Concurrent Watch/Unwatch during concurrent writes (race test, 10 goroutines). ScopedStore events carry the prefixed group name. SetWithTTL emits EventSet. EventType.String() returns correct labels including "unknown" for undefined values.
Coverage: 94.7% to 95.5%.
Coverage Test Suite
coverage_test.go exercises defensive error paths that integration tests cannot reach through normal usage:
- Schema conflict: pre-existing SQLite index named kv causes New() to return store.New: schema: ....
- GetAll scan error: NULL key in a row (requires manually altering the schema to remove the NOT NULL constraint).
- GetAll rows iteration error: physically corrupting database pages mid-file to trigger rows.Err() during multi-page scans.
- Render scan error: same NULL-key technique.
- Render rows iteration error: same corruption technique.
These tests exercise correct defensive code. They must continue to pass but are not indicative of real failure modes in production.
Known Limitations
Single writer. SetMaxOpenConns(1) serialises all access through one connection. Write throughput is bounded by SQLite's single-writer architecture. This is appropriate for the intended use cases (configuration storage, session state, per-tenant key-value data) but is not suitable for high-throughput append-only workloads.
File-backed write throughput. File-backed Set operations (~3,800 ops/sec on Apple M-series) are dominated by fsync. Applications writing at higher rates should use in-memory stores or consider WAL checkpoint tuning.
GetAll memory usage. Fetching a group with 10,000 keys allocates approximately 2.3 MB per call. There is no pagination API. Applications with very large groups should restructure data into smaller groups or query selectively.
No cross-group transactions. There is no API for atomic multi-group operations. Each method is individually atomic at the SQLite level, but there is no Begin/Commit exposed to callers.
No wildcard deletes. There is no DeletePrefix or pattern-based delete. To delete all groups under a namespace, callers must retrieve the group list via Groups() and delete each individually.
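The list-then-delete workaround might look like the sketch below. The groupStore interface and its method signatures are assumptions (the source names Groups and DeleteGroup but does not give their return types), and fakeStore exists only to make the example runnable.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// groupStore lists the two methods the workaround needs.
// These signatures are assumed, not taken from the package.
type groupStore interface {
	Groups(prefix string) ([]string, error)
	DeleteGroup(group string) error
}

// deleteNamespace removes every group under namespace+":" one at a time
// and returns how many groups were deleted before any error.
func deleteNamespace(s groupStore, namespace string) (int, error) {
	groups, err := s.Groups(namespace + ":")
	if err != nil {
		return 0, err
	}
	for i, g := range groups {
		if err := s.DeleteGroup(g); err != nil {
			return i, fmt.Errorf("delete group %q: %w", g, err)
		}
	}
	return len(groups), nil
}

// fakeStore is a hypothetical in-memory implementation for the demo.
type fakeStore map[string]map[string]string

func (f fakeStore) Groups(prefix string) ([]string, error) {
	var out []string
	for g := range f {
		if strings.HasPrefix(g, prefix) {
			out = append(out, g)
		}
	}
	sort.Strings(out)
	return out, nil
}

func (f fakeStore) DeleteGroup(group string) error {
	delete(f, group)
	return nil
}

func demoDelete() (deleted int, remaining []string) {
	s := fakeStore{
		"tenant-a:config": {"theme": "dark"},
		"tenant-a:users":  {"alice": "1"},
		"tenant-b:config": {"theme": "light"},
	}
	deleted, _ = deleteNamespace(s, "tenant-a")
	remaining, _ = s.Groups("")
	return
}

func main() {
	fmt.Println(demoDelete())
}
```

Since no transaction spans the loop, a failure part-way through leaves the namespace partially deleted, and groups created concurrently after the Groups call are not removed.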
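Callback deadlock risk. OnChange callbacks run synchronously in the writer's goroutine while notify() holds s.mu for reading. Calling Watch, Unwatch, or OnChange from within a callback deadlocks, because each of them needs the write-lock. Calling a mutating Store method such as Set or Delete from a callback re-enters notify() and re-acquires the read-lock; this usually succeeds, but Go's sync.RWMutex does not support recursive read-locking, so it can also deadlock if another goroutine is waiting for the write-lock at that moment. Callbacks should therefore avoid calling back into the Store, and the callback documentation should state this constraint prominently.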
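The locking constraint can be demonstrated without hanging the program. In this sketch the store type and canRegisterNow helper are hypothetical; TryLock (available since Go 1.18) is used only so the demo can report that the write-lock is unavailable inside a callback, where a plain Lock would block forever.

```go
package main

import (
	"fmt"
	"sync"
)

// store models only the lock and callback list relevant to the deadlock.
type store struct {
	mu        sync.RWMutex
	callbacks []func()
}

// notify holds the read-lock while callbacks run, as described above.
func (s *store) notify() {
	s.mu.RLock()
	defer s.mu.RUnlock()
	for _, fn := range s.callbacks {
		fn()
	}
}

// canRegisterNow reports whether the write-lock needed by
// Watch/Unwatch/OnChange could be taken right now.
func (s *store) canRegisterNow() bool {
	if !s.mu.TryLock() {
		return false
	}
	s.mu.Unlock()
	return true
}

func demoLock() (inside, outside bool) {
	s := &store{}
	s.callbacks = append(s.callbacks, func() {
		inside = s.canRegisterNow() // runs while notify holds the read-lock
	})
	s.notify()
	outside = s.canRegisterNow() // runs after notify has released it
	return
}

func main() {
	inside, outside := demoLock()
	fmt.Println("write-lock available inside callback:", inside)
	fmt.Println("write-lock available after notify:", outside)
}
```

One way for a callback to register or remove watchers safely is to hand that work to another goroutine or a queue that is processed after notify returns; whether that fits depends on the caller's ordering requirements.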
No persistence of watcher registrations. Watchers and callbacks are in-memory only. They are not persisted across Close/New cycles.
Future Considerations
These are design notes, not committed work:
- Pagination for GetAll. A GetPage(group string, offset, limit int) method would support large groups without full in-memory materialisation.
- Indexed prefix keys. An additional index on (grp, key) prefix would accelerate prefix scans without a full-table scan.
- TTL background purge interval as constructor option. Currently only settable by mutating s.purgeInterval directly in tests. A WithPurgeInterval(d time.Duration) functional option would make this part of the public API.
- Cross-group atomic operations. Exposing a Transaction(func(tx *StoreTx) error) API would allow callers to compose atomic multi-group operations.
- DeletePrefix(prefix string) method. Would enable efficient cleanup of an entire namespace without first listing groups.
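As one illustration of the purge-interval idea, a functional option could be wired into the background goroutine roughly as follows. Nothing here is committed API: the store type, newStore, the purges counter, and demoPurge are hypothetical, and the loop increments a counter where a real store would call PurgeExpired().

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// store is a hypothetical reduction of the real type to its purge loop.
type store struct {
	purgeInterval time.Duration
	cancel        context.CancelFunc
	wg            sync.WaitGroup
	purges        int64 // counts ticks; stands in for PurgeExpired() calls
}

// Option configures a store at construction time.
type Option func(*store)

// WithPurgeInterval overrides the default 60-second purge interval.
// Non-positive values are ignored because time.NewTicker panics on them.
func WithPurgeInterval(d time.Duration) Option {
	return func(s *store) {
		if d > 0 {
			s.purgeInterval = d
		}
	}
}

func newStore(opts ...Option) *store {
	s := &store{purgeInterval: 60 * time.Second}
	for _, opt := range opts {
		opt(s)
	}
	ctx, cancel := context.WithCancel(context.Background())
	s.cancel = cancel
	s.wg.Add(1)
	go s.purgeLoop(ctx)
	return s
}

// purgeLoop ticks until the context is cancelled by Close.
func (s *store) purgeLoop(ctx context.Context) {
	defer s.wg.Done()
	t := time.NewTicker(s.purgeInterval)
	defer t.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-t.C:
			atomic.AddInt64(&s.purges, 1)
		}
	}
}

// Close stops the background goroutine and waits for it to exit.
func (s *store) Close() {
	s.cancel()
	s.wg.Wait()
}

func demoPurge() (defaultInterval time.Duration, purges int64) {
	d := newStore()
	defaultInterval = d.purgeInterval
	d.Close()

	fast := newStore(WithPurgeInterval(5 * time.Millisecond))
	time.Sleep(50 * time.Millisecond)
	fast.Close()
	return defaultInterval, atomic.LoadInt64(&fast.purges)
}

func main() {
	fmt.Println(demoPurge())
}
```

Tests could then pass a short interval through the constructor instead of mutating the field directly.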