From 6e418efb485452df010efa1f27d4aa6f6fb167bb Mon Sep 17 00:00:00 2001
From: Snider
Date: Wed, 11 Mar 2026 13:02:40 +0000
Subject: [PATCH] docs: add human-friendly documentation

Co-Authored-By: Claude Opus 4.6
---
 docs/architecture.md | 219 +++++++++++++++++++++++++++++++++++++++++++
 docs/development.md  | 188 +++++++++++++++++++++++++++++++++++++
 docs/index.md        | 111 ++++++++++++++++++++++
 3 files changed, 518 insertions(+)
 create mode 100644 docs/architecture.md
 create mode 100644 docs/development.md
 create mode 100644 docs/index.md

diff --git a/docs/architecture.md b/docs/architecture.md
new file mode 100644
index 0000000..51e0e01
--- /dev/null
+++ b/docs/architecture.md
@@ -0,0 +1,219 @@
+---
+title: Architecture
+description: Internals of go-cache -- types, data flow, storage format, and security model.
+---
+
+# Architecture
+
+This document explains how `go-cache` works internally, covering its type
+system, on-disc format, data flow, and security considerations.
+
+
+## Core Types
+
+### Cache
+
+```go
+type Cache struct {
+	medium  io.Medium
+	baseDir string
+	ttl     time.Duration
+}
+```
+
+`Cache` is the primary handle. It holds:
+
+- **medium** -- the storage backend (any `io.Medium` implementation).
+- **baseDir** -- the root directory under which all cache files live.
+- **ttl** -- how long entries remain valid after being written.
+
+All three fields are set once during construction via `cache.New()` and are
+immutable for the lifetime of the instance.
+
+
+### Entry
+
+```go
+type Entry struct {
+	Data      json.RawMessage `json:"data"`
+	CachedAt  time.Time       `json:"cached_at"`
+	ExpiresAt time.Time       `json:"expires_at"`
+}
+```
+
+`Entry` is the envelope written to storage. It wraps the caller's data as raw
+JSON and adds two timestamps for expiry tracking. Using `json.RawMessage` means
+the data payload is stored verbatim -- no intermediate deserialisation happens
+during writes.
+
+
+## Constructor Defaults
+
+`cache.New(medium, baseDir, ttl)` applies sensible defaults when arguments are
+zero-valued:
+
+| Parameter | Zero value   | Default applied                             |
+|-----------|--------------|---------------------------------------------|
+| `medium`  | `nil`        | `io.Local` (unsandboxed local filesystem)   |
+| `baseDir` | `""`         | `.core/cache/` relative to the working dir  |
+| `ttl`     | `0`          | `cache.DefaultTTL` (1 hour)                 |
+
+The constructor also calls `medium.EnsureDir(baseDir)` to guarantee the cache
+directory exists before any reads or writes.
+
+
+## Data Flow
+
+### Writing (`Set`)
+
+```
+caller data
+    |
+    v
+json.Marshal(data)           -- serialise caller's value
+    |
+    v
+wrap in Entry{               -- add timestamps
+    Data: ,
+    CachedAt:  time.Now(),
+    ExpiresAt: time.Now().Add(ttl),
+}
+    |
+    v
+json.MarshalIndent(entry)    -- human-readable JSON
+    |
+    v
+medium.Write(path, string)   -- persist via the storage backend
+```
+
+The resulting file on disc (or equivalent record in another medium) looks like:
+
+```json
+{
+  "data": { "foo": "bar" },
+  "cached_at": "2026-03-10T14:30:00Z",
+  "expires_at": "2026-03-10T15:30:00Z"
+}
+```
+
+Parent directories for nested keys (e.g. `github/host-uk/repos`) are created
+automatically via `medium.EnsureDir()`.
+
+
+### Reading (`Get`)
+
+```
+medium.Read(path)
+    |
+    v
+json.Unmarshal -> Entry      -- parse the envelope
+    |
+    v
+time.Now().After(ExpiresAt)? -- check TTL
+    |           |
+   yes          no
+    |           |
+    v           v
+return false   json.Unmarshal(entry.Data, dest)
+(cache miss)    |
+                v
+           return true
+           (cache hit)
+```
+
+Key behaviours:
+
+- If the file does not exist (`os.ErrNotExist`), `Get` returns `(false, nil)` --
+  a miss, not an error.
+- If the file contains invalid JSON, it is treated as a miss (not an error).
+  This prevents corrupted files from blocking the caller.
+- If the entry exists but has expired, it is treated as a miss. The stale file
+  is **not** deleted eagerly -- it remains on disc until explicitly removed or
+  overwritten.
+
+
+### Deletion
+
+- **`Delete(key)`** removes a single entry. If the file does not exist, the
+  operation succeeds silently.
+- **`Clear()`** calls `medium.DeleteAll(baseDir)`, removing the entire cache
+  directory and all its contents.
+
+
+### Age Inspection
+
+`Age(key)` returns the `time.Duration` since the entry was written (`CachedAt`).
+If the entry does not exist or cannot be parsed, it returns `-1`. This is useful
+for diagnostics without triggering the expiry check that `Get` performs.
+
+
+## Key-to-Path Mapping
+
+Cache keys are mapped to file paths by appending `.json` and joining with the
+base directory:
+
+```
+key:  "github/host-uk/repos"
+path: /github/host-uk/repos.json
+```
+
+Keys may contain forward slashes to create a directory hierarchy. This is how
+the GitHub key helpers work:
+
+```go
+func GitHubReposKey(org string) string {
+	return filepath.Join("github", org, "repos")
+}
+
+func GitHubRepoKey(org, repo string) string {
+	return filepath.Join("github", org, repo, "meta")
+}
+```
+
+
+## Security: Path Traversal Prevention
+
+The `Path()` method guards against directory traversal attacks. After computing
+the full path, it resolves both the base directory and the result to absolute
+paths, then checks that the result is still a prefix of the base:
+
+```go
+if !strings.HasPrefix(absPath, absBase) {
+	return "", fmt.Errorf("invalid cache key: path traversal attempt")
+}
+```
+
+This means a key like `../../etc/passwd` will be rejected before any I/O
+occurs. Every public method (`Get`, `Set`, `Delete`, `Age`) calls `Path()`
+internally, so traversal protection is always active.
+
+
+## Concurrency
+
+The `Cache` struct does not include a mutex. Concurrent reads are safe (each
+call does independent file I/O), but concurrent writes to the **same key** may
+produce a race at the filesystem level. If your application writes to the same
+key from multiple goroutines, protect the call site with your own
+synchronisation.
+
+In practice, caches in this ecosystem are typically written by a single
+goroutine (e.g. a CLI command fetching GitHub data) and read by others, which
+avoids contention.
+
+
+## Relationship to go-io
+
+`go-cache` delegates all storage operations to the `io.Medium` interface from
+`go-io`. It uses only five methods:
+
+| Method       | Used by             |
+|--------------|---------------------|
+| `EnsureDir`  | `New`, `Set`        |
+| `Read`       | `Get`, `Age`        |
+| `Write`      | `Set`               |
+| `Delete`     | `Delete`            |
+| `DeleteAll`  | `Clear`             |
+
+This minimal surface makes it straightforward to swap storage backends. For
+tests, `io.NewMockMedium()` provides a fully in-memory implementation with no
+disc access.
diff --git a/docs/development.md b/docs/development.md
new file mode 100644
index 0000000..841f25a
--- /dev/null
+++ b/docs/development.md
@@ -0,0 +1,188 @@
+---
+title: Development
+description: Building, testing, and contributing to go-cache.
+---
+
+# Development
+
+This guide covers how to build, test, and contribute to `go-cache`.
+
+
+## Prerequisites
+
+- **Go 1.26** or later
+- Access to `forge.lthn.ai` modules (`GOPRIVATE=forge.lthn.ai/*`)
+- The `core` CLI (optional, for `core go test` and `core go qa`)
+
+
+## Getting the Source
+
+```bash
+git clone ssh://git@forge.lthn.ai:2223/core/go-cache.git
+cd go-cache
+```
+
+If you are working within the Go workspace at `~/Code/go.work`, the module is
+already available locally and dependency resolution will use workspace overrides.
+
+
+## Running Tests
+
+With the `core` CLI:
+
+```bash
+core go test
+```
+
+With plain Go:
+
+```bash
+go test ./...
+```
+
+To run a single test:
+
+```bash
+core go test --run TestCache
+# or
+go test -run TestCache ./...
+```
+
+The test suite uses `io.NewMockMedium()` for all storage operations, so no
+files are written to disc and tests run quickly in any environment.
+
+
+## Test Coverage
+
+```bash
+core go cov          # Generate coverage report
+core go cov --open   # Generate and open in browser
+```
+
+
+## Code Quality
+
+The full QA pipeline runs formatting, vetting, linting, and tests in one
+command:
+
+```bash
+core go qa           # fmt + vet + lint + test
+core go qa full      # adds race detector, vulnerability scan, security audit
+```
+
+Individual steps:
+
+```bash
+core go fmt          # Format with gofmt
+core go vet          # Static analysis
+core go lint         # Linter checks
+```
+
+
+## Project Structure
+
+```
+go-cache/
+  .core/
+    build.yaml       # Build configuration (targets, flags)
+    release.yaml     # Release configuration (changelog rules)
+  cache.go           # Package source
+  cache_test.go      # Tests
+  go.mod             # Module definition
+  go.sum             # Dependency checksums
+  docs/              # This documentation
+```
+
+The package is intentionally small -- a single source file and a single test
+file. There are no sub-packages.
+
+
+## Writing Tests
+
+Tests follow the standard Go testing conventions. The codebase uses
+`testing.T` directly (not testify assertions) for simplicity. When adding tests:
+
+1. Use `io.NewMockMedium()` rather than the real filesystem.
+2. Keep TTLs short (milliseconds) when testing expiry behaviour.
+3. Name test functions descriptively: `TestCacheExpiry`, `TestCacheDefaults`, etc.
+
+Example of testing cache expiry:
+
+```go
+func TestCacheExpiry(t *testing.T) {
+	m := io.NewMockMedium()
+	c, err := cache.New(m, "/tmp/test", 10*time.Millisecond)
+	if err != nil {
+		t.Fatalf("failed to create cache: %v", err)
+	}
+
+	c.Set("key", "value")
+	time.Sleep(50 * time.Millisecond)
+
+	var result string
+	found, _ := c.Get("key", &result)
+	if found {
+		t.Error("expected expired entry to be a cache miss")
+	}
+}
+```
+
+
+## Commit Conventions
+
+This project uses conventional commits:
+
+```
+feat(cache): add batch eviction support
+fix(cache): handle corrupted JSON gracefully
+refactor: simplify Path() traversal check
+```
+
+The release configuration (`.core/release.yaml`) includes `feat`, `fix`,
+`perf`, and `refactor` in changelogs, and excludes `chore`, `docs`, `style`,
+`test`, and `ci`.
+
+
+## Build Configuration
+
+The `.core/build.yaml` defines cross-compilation targets:
+
+| OS      | Architecture |
+|---------|-------------|
+| Linux   | amd64       |
+| Linux   | arm64       |
+| Darwin  | arm64       |
+| Windows | amd64       |
+
+Since `go-cache` is a library (no `main` package), the build configuration is
+primarily used by the CI pipeline for compilation checks rather than producing
+binaries.
+
+
+## Adding a New Storage Backend
+
+To use the cache with a different storage medium, implement the `io.Medium`
+interface from `go-io` and pass it to `cache.New()`. The cache only requires
+five methods: `EnsureDir`, `Read`, `Write`, `Delete`, and `DeleteAll`. See
+the [architecture](architecture.md) document for the full method mapping.
+
+```go
+import (
+	"forge.lthn.ai/core/go-cache"
+	"forge.lthn.ai/core/go-io/store"
+	"time"
+)
+
+// Use SQLite as the cache backend
+medium, err := store.NewMedium("/path/to/cache.db")
+if err != nil {
+	panic(err)
+}
+
+c, err := cache.New(medium, "cache", 30*time.Minute)
+```
+
+
+## Licence
+
+EUPL-1.2. See the repository root for the full licence text.
diff --git a/docs/index.md b/docs/index.md
new file mode 100644
index 0000000..76cbfe8
--- /dev/null
+++ b/docs/index.md
@@ -0,0 +1,111 @@
+---
+title: go-cache
+description: File-based caching with TTL expiry, storage-agnostic via the go-io Medium interface.
+---
+
+# go-cache
+
+`go-cache` is a lightweight, storage-agnostic caching library for Go. It stores
+JSON-serialised entries with automatic TTL expiry and path-traversal protection.
+
+**Module path:** `forge.lthn.ai/core/go-cache`
+
+**Licence:** EUPL-1.2
+
+
+## Quick Start
+
+```go
+import (
+	"fmt"
+	"time"
+
+	"forge.lthn.ai/core/go-cache"
+)
+
+func main() {
+	// Create a cache with default settings:
+	//   - storage: local filesystem (io.Local)
+	//   - directory: .core/cache/ in the working directory
+	//   - TTL: 1 hour
+	c, err := cache.New(nil, "", 0)
+	if err != nil {
+		panic(err)
+	}
+
+	// Store a value
+	err = c.Set("user/profile", map[string]string{
+		"name": "Alice",
+		"role": "admin",
+	})
+	if err != nil {
+		panic(err)
+	}
+
+	// Retrieve it (returns false if missing or expired)
+	var profile map[string]string
+	found, err := c.Get("user/profile", &profile)
+	if err != nil {
+		panic(err)
+	}
+	if found {
+		fmt.Println(profile["name"]) // Alice
+	}
+}
+```
+
+
+## Package Layout
+
+| File            | Purpose                                                     |
+|-----------------|-------------------------------------------------------------|
+| `cache.go`      | Core types (`Cache`, `Entry`), CRUD operations, key helpers |
+| `cache_test.go` | Tests covering set/get, expiry, delete, clear, defaults     |
+| `go.mod`        | Module definition (Go 1.26)                                 |
+
+
+## Dependencies
+
+| Module                        | Version | Role                                        |
+|-------------------------------|---------|---------------------------------------------|
+| `forge.lthn.ai/core/go-io`    | v0.0.3  | Storage abstraction (`Medium` interface)    |
+| `forge.lthn.ai/core/go-log`   | v0.0.1  | Structured logging (indirect, via `go-io`)  |
+
+There are no other runtime dependencies.
The test suite uses the standard
+library only (plus the `MockMedium` from `go-io`).
+
+
+## Key Concepts
+
+### Storage Backends
+
+The cache does not read or write files directly. All I/O goes through the
+`io.Medium` interface defined in `go-io`. This means the same cache logic works
+against:
+
+- **Local filesystem** (`io.Local`) -- the default
+- **SQLite KV store** (`store.Medium` from `go-io/store`)
+- **S3-compatible storage** (`go-io/s3`)
+- **In-memory mock** (`io.NewMockMedium()`) -- ideal for tests
+
+Pass any `Medium` implementation as the first argument to `cache.New()`.
+
+### TTL and Expiry
+
+Every entry records both `cached_at` and `expires_at` timestamps. On `Get()`,
+if the current time is past `expires_at`, the entry is treated as a cache miss
+-- no stale data is ever returned. The default TTL is one hour
+(`cache.DefaultTTL`).
+
+### GitHub Cache Keys
+
+The package includes two helper functions that produce consistent cache keys
+for GitHub API data:
+
+```go
+cache.GitHubReposKey("host-uk")        // "github/host-uk/repos"
+cache.GitHubRepoKey("host-uk", "core") // "github/host-uk/core/meta"
+```
+
+These are convenience helpers used by other packages in the ecosystem (such as
+`go-devops`) to avoid key duplication when caching GitHub responses.