docs: graduate TODO/FINDINGS into production documentation

Co-Authored-By: Virgil <virgil@lethean.io>
Snider 2026-02-20 15:03:18 +00:00
parent 9996754cc4
commit 12ad23610f
6 changed files with 998 additions and 278 deletions

138
CLAUDE.md

@@ -1,19 +1,7 @@
# CLAUDE.md — go-devops Domain Expert Guide
# CLAUDE.md — go-devops Agent Instructions
You are a dedicated domain expert for `forge.lthn.ai/core/go-devops`. Virgil (in core/go) orchestrates your work via TODO.md. Pick up tasks in phase order, mark `[x]` when done, commit and push.
## What This Package Does
Infrastructure management, build automation, and release pipelines. ~29K LOC across 118 Go files. Provides:
- **Ansible engine** — Native Go playbook executor (not shelling out to `ansible-playbook`). SSH, modules, facts, handlers.
- **Build system** — Plugin-based builders (Go, Wails, Docker, C++, LinuxKit, Taskfile). Cross-compilation, code signing (macOS/GPG/Windows).
- **Release automation** — Version detection, changelog from git history, multi-target publishing (GitHub, Docker, Homebrew, AUR, Scoop, Chocolatey, npm).
- **Infrastructure APIs** — Hetzner Cloud, Hetzner Robot (bare metal), CloudNS DNS.
- **Container/VM management** — LinuxKit images on QEMU (Linux) or Hyperkit (macOS).
- **SDK generation** — OpenAPI spec parsing, TypeScript/Python/Go/PHP client generation, breaking change detection.
- **Developer toolkit** — Code quality metrics, TODO detection, coverage reports, dependency graphs.
## Commands
```bash
@@ -29,103 +17,7 @@ go vet ./... # Static analysis
|--------|-----------|-------|
| `forge.lthn.ai/core/go` | `../core` | Framework (core.E, io.Medium, config, i18n, log) |
**Do NOT change the replace directive path.** Use go.work for local resolution if needed.
## Architecture
### ansible/ — Playbook Execution Engine (~3,162 LOC)
| File | LOC | Purpose |
|------|-----|---------|
| `executor.go` | 1,021 | Playbook runner: task/handler/fact tracking, become/sudo |
| `modules.go` | 1,434 | Module implementations: service, file, template, command, copy, apt, yum |
| `parser.go` | 438 | YAML playbook + inventory parser |
| `ssh.go` | 451 | SSH client connection management |
| `types.go` | 258 | Core types: Play, Task, Handler, Inventory, Facts |
Executes Ansible playbooks natively in Go. Supports: `when` conditionals, `register` variables, `notify` handlers, `become` privilege escalation, `loop` iteration, fact gathering.
### build/ — Project Building & Cross-Compilation (~3,637 LOC)
**Root** (797 LOC): Project type detection, archive creation (tar.gz/xz/zip via Borg compression), config from `.core/build.yaml`, SHA checksums.
**builders/** (1,390 LOC): Plugin interface `Builder.Build()`.
| Builder | LOC | Notes |
|---------|-----|-------|
| `go.go` | — | Go cross-compilation |
| `wails.go` | 247 | Wails desktop app |
| `docker.go` | 215 | Docker image build |
| `cpp.go` | 253 | CMake C++ |
| `linuxkit.go` | 270 | LinuxKit VM image |
| `taskfile.go` | 275 | Taskfile automation |
**signing/** (377 LOC): Signer interface. macOS `codesign`, GPG, Windows `signtool`.
**buildcmd/** (1,053 LOC): CLI handlers for `core build`, `core build pwa`, `core build sdk`, `core release`.
### container/ — LinuxKit VM Management (~1,208 LOC)
| File | LOC | Purpose |
|------|-----|---------|
| `container.go` | 106 | Manager interface + Container model |
| `linuxkit.go` | 462 | LinuxKitManager: Run, Stop, List |
| `hypervisor.go` | 273 | Abstraction: QEMU (Linux) / Hyperkit (macOS) |
| `state.go` | 172 | Container state persistence (`~/.core/state.json`) |
| `templates.go` | 301 | Packer/LinuxKit template rendering |
### devops/ — Portable Dev Environment (~1,216 LOC)
| File | LOC | Purpose |
|------|-----|---------|
| `devops.go` | 243 | Manager: install, boot, stop, status |
| `config.go` | 90 | Config from `~/.core/config.yaml` |
| `images.go` | 198 | ImageManager: download from GitHub/CDN/registry |
| `shell.go` | 74 | Shell execution wrapper |
| `test.go` | 188 | Test execution helpers |
| `serve.go` | 109 | Dev environment HTTP server |
| `claude.go` | 143 | Claude/AI integration |
| `ssh_utils.go` | 68 | SSH key scanning |
**sources/** (218 LOC): `ImageSource` interface. GitHub Releases + S3/CDN download sources.
### infra/ — Infrastructure APIs (~953 LOC)
| File | LOC | Purpose |
|------|-----|---------|
| `config.go` | 300 | `infra.yaml` types: Host, LoadBalancer, Network, DNS, Database, Cache |
| `hetzner.go` | 381 | Hetzner Cloud API (VPS) + Hetzner Robot API (bare metal) |
| `cloudns.go` | 272 | CloudNS DNS: zones, records, ACME DNS-01 challenges |
### release/ — Release Automation (~4,008 LOC)
**Root** (1,398 LOC): Release orchestrator (version → build → changelog → publish), config from `.core/release.yaml`, git-based changelog, semver detection.
**publishers/** (2,610 LOC): `Publisher` interface.
| Publisher | LOC | Notes |
|-----------|-----|-------|
| `github.go` | 233 | GitHub Releases |
| `docker.go` | 278 | Docker image build + push |
| `homebrew.go` | 371 | Homebrew formula |
| `npm.go` | 265 | npm registry |
| `aur.go` | 313 | Arch Linux AUR |
| `scoop.go` | 284 | Windows Scoop |
| `chocolatey.go` | 294 | Windows Chocolatey |
| `linuxkit.go` | 300 | LinuxKit image |
### sdk/ — OpenAPI SDK Generation (~931 LOC)
Auto-detect OpenAPI spec, generate typed clients in 4 languages, detect breaking changes via oasdiff.
**generators/** (437 LOC): TypeScript, Python, Go, PHP generators.
### devkit/ — Developer Toolkit (~560 LOC)
Code quality analysis: TODOs/FIXMEs, coverage reports, race conditions, vulnerability detection, secret leak scanning, cyclomatic complexity, dependency graphs.
### deploy/ — Deployment Integrations (~366 LOC)
- **python/** — Embedded Python 3.13 runtime (kluctl/go-embed-python)
- **coolify/** — Coolify PaaS API client via Python Swagger
**Do NOT change the replace directive path.**
## Key Interfaces
@@ -171,24 +63,6 @@ type Generator interface {
}
```
## External Dependencies
| Package | Purpose |
|---------|---------|
| `github.com/Snider/Borg` | Compression (xz) for archives. **Not** Secure/Blob/Pointer. |
| `github.com/getkin/kin-openapi` | OpenAPI 3.x spec parsing |
| `github.com/oasdiff/oasdiff` | API breaking change detection |
| `github.com/kluctl/go-embed-python` | Embedded Python 3.13 runtime |
| `github.com/spf13/cobra` | CLI framework for build/release commands |
| `golang.org/x/crypto` | SSH connections (ansible/) |
## Configuration Files
- `.core/build.yaml` — Build targets, ldflags, signing, archive format
- `.core/release.yaml` — Version source, changelog style, SDK langs, publisher configs
- `infra.yaml` — Host inventory, DNS zones, cloud provider settings
- `~/.core/config.yaml` — Local dev environment config
## Coding Standards
- **UK English**: colour, organisation, centre
@@ -203,6 +77,12 @@ type Generator interface {
- **Repo**: `forge.lthn.ai/core/go-devops`
- **Push via SSH**: `git push forge main` (remote: `ssh://git@forge.lthn.ai:2223/core/go-devops.git`)
## Documentation
- Architecture: `docs/architecture.md`
- Development guide: `docs/development.md`
- Project history: `docs/history.md`
## Task Queue
See `TODO.md` for prioritised work. See `FINDINGS.md` for research notes.
See `TODO.md` for prioritised work.


@@ -1,78 +0,0 @@
# FINDINGS.md — go-devops Research & Discovery
## 2026-02-20: Initial Analysis (Virgil)
### Origin
Extracted from `core/go` on 16 Feb 2026 (commit `392ad68`). Single extraction commit — fresh repo.
### Package Inventory
| Package | Files | Source LOC | Test Files | Notes |
|---------|-------|-----------|-----------|-------|
| `ansible/` | 5 | 3,162 | 1 | Playbook executor, SSH, modules, parser |
| `build/` | 6 | 797 | 4 | Project detection, archives, checksums, config |
| `build/builders/` | 6 | 1,390 | — | Go, Wails, Docker, C++, LinuxKit, Taskfile |
| `build/signing/` | 5 | 377 | — | macOS, GPG, Windows signtool |
| `build/buildcmd/` | 6 | 1,053 | — | CLI command handlers |
| `container/` | 5 | 1,208 | 4 | LinuxKit VMs, hypervisor abstraction, state |
| `deploy/python/` | 1 | 147 | — | Embedded Python 3.13 |
| `deploy/coolify/` | 1 | 219 | — | Coolify PaaS API client |
| `devkit/` | 1 | 560 | 1 | Code quality metrics |
| `devops/` | 8 | 1,216 | 8 | Dev environment manager |
| `devops/sources/` | 3 | 218 | — | GitHub/CDN image sources |
| `infra/` | 3 | 953 | 1 | Hetzner, CloudNS, config |
| `release/` | 5 | 1,398 | 5 | Release orchestrator |
| `release/publishers/` | 9 | 2,610 | 9 | 8 target platforms |
| `sdk/` | 3 | 494 | 3 | OpenAPI detection + diff |
| `sdk/generators/` | 5 | 437 | 5 | 4-language SDK gen |
**Total**: ~29K LOC across 71 source files + 47 test files
### Key Observations
1. **ansible/modules.go is the largest file** — 1,434 LOC implementing Ansible modules in pure Go. Zero tests. Highest-priority testing gap.
2. **Borg dependency is compression-only** — `github.com/Snider/Borg` used for xz archive creation in `build/archive.go`. Does NOT use the Secure/Blob/Pointer features.
3. **Python 3.13 embedded** — `deploy/python/` embeds a full Python runtime via kluctl/go-embed-python. Used exclusively for Coolify API client (Python Swagger). Consider replacing with native Go HTTP client to remove the 50MB+ Python dependency.
4. **DigitalOcean gap** — Was referenced in documentation but no types or implementation existed in code. Removed stale documentation references. No dead types to clean up.
5. **Single-commit repo** — Entire codebase arrived in one `feat: extract` commit. No git history for individual components. This makes blame/bisect impossible for bugs originating before extraction.
6. **Hypervisor platform detection** — `container/hypervisor.go` auto-selects QEMU on Linux, Hyperkit on macOS. Both are platform-specific — tests may need build tags or mocking.
7. **CLI via Cobra** — `build/buildcmd/` uses Cobra directly (not core/go's CLI framework). May need alignment.
8. **8 release publishers** — GitHub, Docker, Homebrew, npm, AUR, Scoop, Chocolatey, LinuxKit. All implement the `Publisher` interface. Each is ~250-370 LOC. All have test files.
### Test Coverage Gaps
| Package | Gap Severity | Notes |
|---------|-------------|-------|
| `ansible/modules.go` | **Critical** | 1,434 LOC, zero tests |
| `ansible/executor.go` | **Critical** | 1,021 LOC, zero tests |
| `ansible/parser.go` | High | 438 LOC, zero tests |
| `infra/hetzner.go` | High | 381 LOC, zero tests — API calls untested |
| `infra/cloudns.go` | High | 272 LOC, zero tests — DNS ops untested |
| `build/builders/*` | Medium | 1,390 LOC, no individual builder tests |
| `build/signing/*` | Medium | 377 LOC, signing logic untested |
| `deploy/*` | Low | 366 LOC, Python/Coolify integration |
### Integration Points
- **core/go** → Framework (core.E, io.Medium, config, logging)
- **core/go-crypt** → SSH key management (ansible/ssh.go uses golang.org/x/crypto directly, could use go-crypt)
- **core/cli** → Build/release commands registered via Cobra
- **DevOps repo** — `infra.yaml` config used by Ansible playbooks in `/Users/snider/Code/DevOps`
### Config File Ecosystem
| File | Location | Purpose |
|------|----------|---------|
| `.core/build.yaml` | Project root | Build targets, signing, archives |
| `.core/release.yaml` | Project root | Version, changelog, publishers |
| `infra.yaml` | Project root | Host inventory, DNS, cloud providers |
| `~/.core/config.yaml` | User home | Local dev environment config |
| `~/.core/state.json` | User home | Container/VM state persistence |

71
TODO.md

@@ -1,71 +0,0 @@
# TODO.md — go-devops
Dispatched from core/go orchestration. Pick up tasks in order.
---
## Phase 0: Test Coverage & Hardening
- [x] **Expand ansible/ tests** — Added `parser_test.go` (17 tests: ParsePlaybook, ParseInventory, ParseTasks, GetHosts, GetHostVars, isModule, NormalizeModule), `types_test.go` (RoleRef/Task UnmarshalYAML, Inventory, Facts, TaskResult, KnownModules), `executor_test.go` (getHosts, matchesTags, evaluateWhen, templateString, applyFilter, resolveLoop, templateArgs, handleNotify, normalizeConditions, helper functions). All pass. Commit `6e346cb`.
- [x] **Expand infra/ tests** — Added `hetzner_test.go` (HCloudClient/HRobotClient construction, do() round-trip via httptest, API error handling, JSON serialisation for HCloudServer, HCloudLoadBalancer, HRobotServer) and `cloudns_test.go` (doRaw() round-trip, zone/record JSON, CRUD responses, ACME challenge, auth params, errors). Commit `6e346cb`.
- [x] **Expand build/ tests** — Added `archive_test.go` (archive round-trip for tar.gz/zip, multi-file archives, 249 LOC) and extended `signing_test.go` (mock signer tests, path verification, error handling, +181 LOC). Commit `5d22ed9`.
- [x] **Expand release/ tests** — Fixed nil pointer crash in `linuxkit.go:50` (added `release.FS == nil` guard). Added nil FS test case to `linuxkit_test.go` (+23 LOC). 862 tests pass across build/ and release/. Commit `5d22ed9`.
- [x] **Race condition tests** — `go test -race ./...` clean across ansible, infra, container, devops, build packages. Commit `6e346cb`.
- [x] **`go vet ./...` clean** — Fixed stale API calls in container/linuxkit_test.go, state_test.go, templates_test.go, devops/devops_test.go. go.mod replace directive fixed. Commit `6e346cb`.
## Phase 1: Ansible Engine Hardening
### Step 1.0: SSH mock infrastructure
- [x] **Create `ansible/mock_ssh_test.go`** — MockSSHClient with command registry (`expectCommand`), file system simulation (in-memory map), become state tracking, execution/upload logs, and assertion helpers (`hasExecuted`, `hasExecutedMethod`, `findExecuted`). Module shims via `sshRunner` interface for testability. 12 mock infrastructure tests. Commit `3330e55`.
### Step 1.1: Command execution modules (4 modules, ~100 LOC)
- [x] **Test command/shell/raw/script** — 36 module tests + 12 mock tests = 48 new tests. Verifies: command uses `Run()`, shell uses `RunScript()`, raw passes through without wrapping, script reads local file. Cross-module differentiation tests, dispatch routing, template variable resolution. Commit `3330e55`.
### Step 1.2: File operation modules (6 modules, ~280 LOC)
- [x] **Test copy/template/file/lineinfile/blockinfile/stat** — 54 new tests: copy (8), file (12), lineinfile (8), blockinfile (7), stat (5), template (6), dispatch (6), integration (2). Extended mock with `sshFileRunner` interface and 6 module shims. Fixed unsupported module test (copy→hostname). Total ansible tests: 208. Commit `c7da9ad`.
### Step 1.3: Service & package modules (7 modules, ~180 LOC)
- [x] **Test service/systemd/apt/apt_key/apt_repository/package/pip** — 56 new tests: service (12), systemd (4), apt (9), apt_key (6), apt_repository (8), package (3), pip (8), dispatch (7). 7 module shims in mock_ssh_test.go. Commit `9638e77`.
### Step 1.4: User/group & advanced modules (10 modules, ~385 LOC)
- [x] **Test user/group/cron/authorized_key/git/unarchive/uri/ufw/docker_compose** — 69 new tests: user (7), group (7), cron (5), authorized_key (7), git (8), unarchive (8), uri (6), ufw (8), docker_compose (7), dispatch (6). 9 module shims. Total ansible tests: 334. Commit `427929f`.
### Step 1.5: Error propagation & become
- [x] **Error propagation** — 68 tests: getHosts (8), matchesTags (7), evaluateWhen/evalCondition (22), templateString (14), applyFilter (9), resolveLoop (5), handleNotify (5), normalizeConditions (6), cross-cutting (7).
- [x] **Become/sudo** — 8 tests: enable/disable cycle, default user, passwordless, play-level become.
- [x] **Fact gathering** — 9 tests: Ubuntu/CentOS/Alpine/Debian os-release parsing, hostname, localhost.
- [x] **Idempotency checks** — 8 tests: group exists, authorized_key present, docker compose up-to-date, stat always unchanged.
- Total ansible tests: 438. Phase 1 complete. Commit `8ab8643`.
## Phase 2: Infrastructure API Robustness
- [x] **API client abstraction** — Extracted shared `APIClient` struct in `infra/client.go` with `Do()` (JSON) and `DoRaw()` (bytes) methods. `HCloudClient`, `HRobotClient`, and `CloudNSClient` now delegate to `APIClient` via configurable auth functions and error prefixes. Options: `WithHTTPClient`, `WithRetry`, `WithAuth`, `WithPrefix`. 30 new `client_test.go` tests.
- [x] **Retry logic** — `APIClient` implements configurable exponential backoff with jitter. Retries on 5xx and transport errors; does NOT retry 4xx (except 429). `RetryConfig{MaxRetries, InitialBackoff, MaxBackoff}` with `DefaultRetryConfig()` (3 retries, 100ms, 5s). Context cancellation respected during backoff.
- [x] **Rate limiting** — Detects 429 responses, parses `Retry-After` header (seconds format, falls back to 1s). Queues subsequent requests behind a shared `blockedUntil` mutex. Rate limit wait is per-`APIClient` instance. Tested with real 1s Retry-After delays.
- [x] **DigitalOcean support** — Investigated: no types or implementation existed in `infra/` code, only stale documentation references in CLAUDE.md and FINDINGS.md. Removed the dead references. No code changes needed.
## Phase 3: Release Pipeline Testing
- [x] **Publisher integration tests** — Added `integration_test.go` (48 tests): GitHub dry-run/command-building/repo-detection/artifact-upload, Docker dry-run/buildx-args/config-parsing, Homebrew dry-run/formula-generation/class-naming, Scoop dry-run/manifest, AUR dry-run/PKGBUILD/SRCINFO, Chocolatey dry-run/nuspec, npm dry-run/package.json, LinuxKit dry-run multi-format/platform. Cross-publisher: name uniqueness, nil relCfg, checksum mapping, interface compliance. Commit `032d862`.
- [x] **SDK generation tests** — Added `generation_test.go` (38 tests): SDK orchestration (Generate iterates languages, output directory, no-spec error, unknown language), generator registry (register/get/overwrite/languages), generator interface compliance (language identifiers, install instructions, Available safety), SDK config (defaults, SetVersion, nil config), spec detection priority (configured > common > Scramble, all 8 common paths). Commit `032d862`.
- [x] **Breaking change detection** — Added `breaking_test.go` (30 tests): oasdiff integration for add-endpoint (non-breaking), remove-endpoint (breaking), add-required-param (breaking), add-optional-param (non-breaking), change-response-type (breaking), remove-HTTP-method (breaking), identical-specs, multiple-breaking-changes, JSON spec support. Error handling: non-existent base/revision, invalid YAML. DiffExitCode (0/1/2), DiffResult summary/human-readable changes. Commit `032d862`.
## Phase 4: DevKit Expansion
- [x] **Vulnerability scanning** — `VulnCheck()` runs `govulncheck -json` and `ParseVulnCheckJSON()` parses newline-delimited JSON into `VulnFinding` structs (ID, aliases, package, called function, description, fixed version, module path). Handles malformed lines, missing OSV entries, empty traces. 13 tests in `vulncheck_test.go`. Commit `e20083d`.
- [x] **Complexity thresholds** — `AnalyseComplexity()` walks Go source via `go/ast` with configurable threshold (default 15). Counts: if, for, range, case, comm, &&, ||, type switch, select. Skips vendor/, test files, hidden dirs. `AnalyseComplexitySource()` for in-memory parsing. 21 tests in `complexity_test.go`. Commit `e20083d`.
- [x] **Coverage trending** — `ParseCoverProfile()` parses coverprofile format, `ParseCoverOutput()` parses human-readable `go test -cover` output. `CoverageStore` with JSON persistence (Append/Load/Latest). `CompareCoverage()` diffs snapshots, flags regressions/improvements/new/removed packages. 19 tests in `coverage_test.go`. Commit `e20083d`.
---
## Workflow
1. Virgil in core/go writes tasks here after research
2. This repo's dedicated session picks up tasks in phase order
3. Mark `[x]` when done, note commit hash

482
docs/architecture.md Normal file

@@ -0,0 +1,482 @@
# Architecture — go-devops
`forge.lthn.ai/core/go-devops` is an infrastructure and build automation library written in Go. It provides native Go implementations of Ansible playbook execution, multi-target build pipelines, release automation, infrastructure API clients, container/VM management, SDK generation, and a developer toolkit with static analysis capabilities. The library is approximately 29,000 lines of source across 71 source files.
## Package Map
```
go-devops/
├── ansible/             Ansible playbook execution engine (native Go, no shell-out)
├── build/               Build system: project detection, archives, checksums
│   ├── builders/        Plugin implementations: Go, Wails, Docker, C++, LinuxKit, Taskfile
│   ├── signing/         Code signing: macOS codesign, GPG, Windows signtool
│   └── buildcmd/        Cobra CLI handlers for core build / core release
├── container/           LinuxKit VM management, hypervisor abstraction, state
├── deploy/              Deployment integrations
│   ├── python/          Embedded Python 3.13 runtime
│   └── coolify/         Coolify PaaS API client (via Python Swagger)
├── devkit/              Developer toolkit: quality metrics, security, coverage trending
├── devops/              Portable dev environment management
│   └── sources/         Image download sources: GitHub Releases, S3/CDN
├── infra/               Infrastructure APIs: Hetzner Cloud, Hetzner Robot, CloudNS
├── release/             Release orchestration: version, changelog, publishing
│   └── publishers/      Platform publishers: GitHub, Docker, Homebrew, npm, AUR, Scoop, Chocolatey, LinuxKit
└── sdk/                 OpenAPI SDK generation and breaking-change detection
    └── generators/      Language generators: TypeScript, Python, Go, PHP
```
## Core Interfaces
Every extensible subsystem is defined by a small interface.
```go
// build/builders — project type plugin
type Builder interface {
	Name() string
	Detect(fs io.Medium, dir string) (bool, error)
	Build(ctx context.Context, cfg *Config, targets []Target) ([]Artifact, error)
}

// release/publishers — distribution target plugin
type Publisher interface {
	Name() string
	Publish(ctx context.Context, release *Release, pubCfg PublisherConfig, relCfg ReleaseConfig, dryRun bool) error
}

// container — hypervisor abstraction
type Hypervisor interface {
	Name() string
	Available() bool
	Run(ctx context.Context, opts RunOptions) (*process.Handle, error)
}

// devops/sources — image download plugin
type ImageSource interface {
	Name() string
	Available() bool
	Download(ctx context.Context, name, version string, progress func(downloaded, total int64)) (string, error)
}

// build/signing — code signing plugin
type Signer interface {
	Name() string
	Available() bool
	Sign(filePath, keyID string) ([]byte, error)
}

// sdk/generators — language SDK generator
type Generator interface {
	Language() string
	Generate(ctx context.Context, spec, outputDir string, config *Config) error
}
```
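As an illustration of how small these plugin contracts are, the following sketch implements the `Signer` interface with a stand-in backend and picks the first available signer from a list. The `digestSigner` type and the `firstAvailable` helper are hypothetical names invented for this example; the real backends call external tools rather than hashing their inputs.

```go
package main

import (
	"crypto/sha256"
	"errors"
	"fmt"
)

// Signer mirrors the build/signing interface shown above.
type Signer interface {
	Name() string
	Available() bool
	Sign(filePath, keyID string) ([]byte, error)
}

// digestSigner is a hypothetical stand-in backend. It does not produce a
// real signature; it hashes the key ID and path so the example has output.
type digestSigner struct{ enabled bool }

func (s digestSigner) Name() string    { return "digest" }
func (s digestSigner) Available() bool { return s.enabled }

func (s digestSigner) Sign(filePath, keyID string) ([]byte, error) {
	if filePath == "" {
		return nil, errors.New("digest signer: empty file path")
	}
	sum := sha256.Sum256([]byte(keyID + ":" + filePath))
	return sum[:], nil
}

// firstAvailable returns the first signer that reports itself usable,
// or nil when none is.
func firstAvailable(signers []Signer) Signer {
	for _, s := range signers {
		if s.Available() {
			return s
		}
	}
	return nil
}

func main() {
	s := firstAvailable([]Signer{
		digestSigner{enabled: false},
		digestSigner{enabled: true},
	})
	sig, err := s.Sign("dist/app", "KEY")
	fmt.Println(s.Name(), len(sig), err)
}
```

Callers depend only on the interface, so a test double like this one can replace `codesign`, `gpg`, or `signtool` without touching the calling code.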
---
## Ansible Integration
### Overview
The `ansible/` package executes Ansible playbooks natively in Go without shelling out to `ansible-playbook`. It implements the Ansible execution model — facts, handlers, `register`, `when`, `loop`, `become` — over SSH connections managed with `golang.org/x/crypto/ssh`.
### Package Files
| File | LOC | Responsibility |
|------|-----|---------------|
| `executor.go` | 1,021 | Playbook runner: task dispatch, handler firing, fact injection, become/sudo |
| `modules.go` | 1,434 | Module implementations: ~30 Ansible modules in pure Go |
| `parser.go` | 438 | YAML playbook + inventory parser |
| `ssh.go` | 451 | SSH client with persistent connection and file upload |
| `types.go` | 258 | Core types: Play, Task, Handler, Inventory, Facts |
### Data Model
A `Playbook` contains one or more `Play` structs. Each play targets a host pattern from the `Inventory` and runs a list of `Task` structs. Tasks carry a `Module` name (derived from the YAML key), `Args` map, and optional control fields (`when`, `register`, `loop`, `notify`, `become`).
```
Playbook
└── []Play
    ├── Hosts     string          (host pattern: "webservers", "all")
    ├── Become    bool
    ├── Vars      map[string]any
    ├── PreTasks  []Task
    ├── Tasks     []Task
    ├── PostTasks []Task
    ├── Handlers  []Task
    └── Roles     []RoleRef
```
`TaskResult` carries the Ansible result contract: `Changed`, `Failed`, `Skipped`, `Stdout`, `Stderr`, `RC`, and a `Data` map for module-specific output.
### Execution Model
1. Parser reads a YAML playbook file and builds the `Playbook` struct.
2. The inventory is parsed from a separate YAML file (or from the play's `vars`).
3. For each play, the executor resolves target hosts from the `Inventory`.
4. If `gather_facts` is enabled, the executor SSHs to each host, reads `/etc/os-release`, and populates a `Facts` struct.
5. Tasks are executed in order. For each task:
   - `when` conditionals are evaluated using Go template logic and registered variables.
   - `loop` items are resolved and the module is called once per item.
   - The module function is dispatched via a string-keyed registry that normalises both long (`ansible.builtin.shell`) and short (`shell`) module names.
   - `register` stores the `TaskResult` in a variable map for subsequent `when` and template evaluations.
   - `notify` queues handler names; handlers fire once at the end of the play if any task triggered them.
6. `become: true` prefixes commands with `sudo -u <become_user>`.
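Three of the steps above can be sketched in a few lines: module name normalisation, once-only handler queuing, and the `become` prefix. This is a minimal sketch and not the package's code. The names `normaliseModule`, `handlerQueue`, and `becomeCommand` are hypothetical, the `root` default user is an assumption, and handlers are flushed in first-notified order purely for simplicity.

```go
package main

import (
	"fmt"
	"strings"
)

// normaliseModule reduces a fully qualified name such as
// "ansible.builtin.shell" to its short form "shell" (hypothetical helper).
func normaliseModule(name string) string {
	if i := strings.LastIndex(name, "."); i >= 0 {
		return name[i+1:]
	}
	return name
}

// handlerQueue records notified handler names so that each one is
// returned at most once per flush.
type handlerQueue struct {
	seen  map[string]bool
	order []string
}

func (q *handlerQueue) notify(name string) {
	if q.seen == nil {
		q.seen = make(map[string]bool)
	}
	if !q.seen[name] {
		q.seen[name] = true
		q.order = append(q.order, name)
	}
}

// flush returns the handlers to run at the end of the play and resets
// the queue.
func (q *handlerQueue) flush() []string {
	out := q.order
	q.seen, q.order = nil, nil
	return out
}

// becomeCommand prefixes cmd with sudo for the given user. Falling back
// to "root" when no user is set is an assumption of this sketch.
func becomeCommand(cmd, user string) string {
	if user == "" {
		user = "root"
	}
	return "sudo -u " + user + " " + cmd
}

func main() {
	var q handlerQueue
	q.notify("restart nginx")
	q.notify("restart nginx")
	fmt.Println(normaliseModule("ansible.builtin.shell"), q.flush())
	fmt.Println(becomeCommand("systemctl restart nginx", ""))
}
```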
### Implemented Modules
Modules are grouped by category:
- **Command execution**: `command`, `shell`, `raw`, `script`
- **File operations**: `copy`, `template`, `file`, `lineinfile`, `blockinfile`, `stat`
- **Package management**: `apt`, `apt_key`, `apt_repository`, `yum`, `dnf`, `package`, `pip`
- **Service management**: `service`, `systemd`
- **User and group**: `user`, `group`, `authorized_key`, `cron`
- **Source control**: `git`, `unarchive`
- **Network**: `uri`, `get_url`
- **Firewall**: `ufw`
- **Container**: `docker_compose`
- **Control flow**: `debug`, `fail`, `assert`, `set_fact`, `include_vars`, `wait_for`, `pause`, `meta`
### SSH Layer
`ssh.go` manages a persistent `*ssh.Client` per host. It exposes three operations:
- `Run(cmd string) (stdout, stderr string, rc int, err error)` — executes a command
- `RunScript(script string) (...)` — uploads a temporary script and runs it
- `Upload(localPath, remotePath string) error` — SCP-style file upload via SFTP subsystem
Connection parameters (host, port, user, private key file) are drawn from the `Host` struct in the inventory.
---
## Build Pipeline
### Overview
The `build/` package provides project-type detection and cross-compilation. Configuration is read from `.core/build.yaml`. The `buildcmd/` sub-package registers Cobra commands (`core build`, `core build pwa`, `core build sdk`, `core release`) into the main CLI.
### Project Detection
`discovery.go` probes marker files in priority order:
| Marker file | Project type |
|-------------|-------------|
| `wails.json` | `wails` |
| `go.mod` | `go` |
| `package.json` | `node` |
| `composer.json` | `php` |
| `CMakeLists.txt` | `cpp` |
| `Dockerfile` | `docker` |
| `*.yml` (linuxkit pattern) | `linuxkit` |
| `Taskfile.yml` | `taskfile` |
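Priority-ordered probing of this kind can be sketched as a slice of marker entries walked from top to bottom, returning on the first hit. The `marker` type and the `detect` and `onDisk` helpers are hypothetical names for illustration, and the LinuxKit glob row is left out because its exact pattern is not given here.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// marker pairs a marker file with the project type it implies.
type marker struct{ file, projectType string }

// markers follows the table above in priority order. The linuxkit
// pattern row is omitted from this sketch.
var markers = []marker{
	{"wails.json", "wails"},
	{"go.mod", "go"},
	{"package.json", "node"},
	{"composer.json", "php"},
	{"CMakeLists.txt", "cpp"},
	{"Dockerfile", "docker"},
	{"Taskfile.yml", "taskfile"},
}

// detect returns the type of the first marker for which exists reports
// true, or "" when nothing matches.
func detect(exists func(name string) bool) string {
	for _, m := range markers {
		if exists(m.file) {
			return m.projectType
		}
	}
	return ""
}

// onDisk builds an exists function that checks files under dir.
func onDisk(dir string) func(string) bool {
	return func(name string) bool {
		_, err := os.Stat(filepath.Join(dir, name))
		return err == nil
	}
}

func main() {
	fmt.Printf("detected project type: %q\n", detect(onDisk(".")))
}
```

Because `wails.json` is checked before `go.mod`, a Wails project (which also contains a `go.mod`) resolves to `wails` rather than `go`.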
### Build Targets and Artifacts
A `Target` carries `OS` and `Arch` (matching `GOOS`/`GOARCH`). Each builder produces `[]Artifact`, where each artifact has a file `Path`, `OS`, `Arch`, and a SHA-256 `Checksum`. Checksums are computed and stored in `dist/*.sha256` files.
### Archive Creation
`archive.go` packages build outputs using `github.com/Snider/Borg` for xz compression. Supported formats: `tar.gz`, `tar.xz`, `zip`. The Borg dependency is used only for xz support; it does not use the Secure/Blob or Secure/Pointer features.
### Builder Plugins
Each builder implements `Builder.Detect()` to self-identify for a directory and `Builder.Build()` to produce artifacts.
| Builder | Notes |
|---------|-------|
| `go.go` | `go build` with ldflags injection, cross-compilation via `GOOS`/`GOARCH` env |
| `wails.go` | Wails v3 desktop app build, platform-specific packaging |
| `docker.go` | `docker buildx` with multi-platform support, optional push |
| `cpp.go` | CMake configure + build in a temp directory |
| `linuxkit.go` | LinuxKit YAML config → multi-format VM images (iso, qcow2, raw, vmdk) |
| `taskfile.go` | Delegates to `task` CLI with target mapping |
### Code Signing
`signing/` implements a `Signer` interface with three backends:
| Signer | Platform | Tool |
|--------|----------|------|
| macOS | darwin | `codesign` |
| GPG | any | `gpg --detach-sign` |
| Windows | windows | `signtool` |
`Available()` checks whether the required tool exists at runtime. Signing is applied to binary artifacts after build, before archiving.
---
## Infrastructure Management
### Overview
The `infra/` package provides typed API clients for Hetzner Cloud (VPS), Hetzner Robot (bare metal), and CloudNS DNS. All three share a common `APIClient` with exponential backoff, rate-limit handling, and configurable authentication.
### Shared API Client
`client.go` defines `APIClient`:
```go
type APIClient struct {
	client       *http.Client
	retry        RetryConfig
	authFn       func(req *http.Request)
	prefix       string
	mu           sync.Mutex
	blockedUntil time.Time
}
```
- `Do(req, result)` — executes a request with JSON decoding.
- `DoRaw(req)` — executes a request and returns raw bytes.
- Both methods apply: auth injection, rate-limit window respect, exponential backoff with jitter on 5xx and transport errors. 4xx errors (except 429) are not retried.
**Retry configuration** (`RetryConfig`):
| Field | Default |
|-------|---------|
| `MaxRetries` | 3 |
| `InitialBackoff` | 100 ms |
| `MaxBackoff` | 5 s |
**Rate limiting**: on HTTP 429, the `Retry-After` header (seconds format) is parsed and a `blockedUntil` timestamp is set. All subsequent requests on the same `APIClient` instance wait until that timestamp before proceeding. Context cancellation is honoured during the wait.
### Hetzner Cloud Client
`hetzner.go` wraps the Hetzner Cloud v1 API (Bearer token auth). Supported resources: servers, load balancers, networks, volumes, SSH keys, firewalls. The Hetzner Robot client (Basic Auth) supports bare metal servers.
### CloudNS Client
`cloudns.go` wraps the CloudNS API v1 (auth-id + auth-password query parameters). Supports: zone listing, record CRUD, ACME DNS-01 challenge records. Used by the `infra` provisioning pipeline for automatic TLS certificate issuance via Let's Encrypt.
### Infrastructure Configuration
`infra.yaml` (project root) defines the full host inventory and cloud resources:
```yaml
hosts:
  - name: de1
    ip: 1.2.3.4
    provider: hetzner-robot
    roles: [web, db]
dns:
  provider: cloudns
  zones: [lthn.ai, leth.in]
loadbalancers:
  - name: lb-de1
    provider: hetzner-cloud
```
The `config.go` file in `infra/` parses this into typed structs: `Host`, `LoadBalancer`, `Network`, `DNSZone`, `Database`, `Cache`.
---
## Release Workflow
### Overview
The `release/` package orchestrates the full release pipeline: version detection, changelog generation from git history, and publishing to multiple distribution targets. Configuration is read from `.core/release.yaml`.
### Version Detection
`DetermineVersion(dir string)` checks in priority order:
1. Git tag on `HEAD` (exact match).
2. Most recent tag with patch increment (`IncrementVersion`).
3. Default `v0.0.1` if no tags exist.
`IncrementVersion` parses semver and increments the patch component, stripping any pre-release suffix.
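Read literally, the patch increment described above could look like the sketch below. `incrementPatch` is a hypothetical name and is not the package's `IncrementVersion`; it assumes a `vMAJOR.MINOR.PATCH` tag and drops any pre-release or build suffix before incrementing.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// incrementPatch strips a pre-release or build suffix from tag and
// returns the version with its patch component increased by one.
func incrementPatch(tag string) (string, error) {
	v := strings.TrimPrefix(tag, "v")
	if i := strings.IndexAny(v, "-+"); i >= 0 {
		v = v[:i]
	}
	parts := strings.Split(v, ".")
	if len(parts) != 3 {
		return "", fmt.Errorf("invalid semver %q", tag)
	}
	nums := make([]int, 3)
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return "", fmt.Errorf("invalid semver %q", tag)
		}
		nums[i] = n
	}
	return fmt.Sprintf("v%d.%d.%d", nums[0], nums[1], nums[2]+1), nil
}

func main() {
	for _, tag := range []string{"v1.2.3", "v1.2.3-rc.1", "1.2"} {
		next, err := incrementPatch(tag)
		fmt.Println(tag, next, err)
	}
}
```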
### Changelog Generation
`changelog.go` (`Generate` function) reads git log since the previous tag and formats commits into a markdown changelog. Conventional commit prefixes (`feat:`, `fix:`, `refactor:`, etc.) are parsed to group entries.
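Grouping by conventional commit prefix can be sketched with a single regular expression. This is an illustration, not the package's `Generate` function: `groupCommits` and `render` are hypothetical names, the input is a plain slice of commit subjects rather than live `git log` output, and unmatched subjects are collected under an assumed `other` group.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// conventional matches "type(scope)!: subject"; scope and "!" are optional.
var conventional = regexp.MustCompile(`^(\w+)(\([^)]*\))?!?:\s+(.+)$`)

// groupCommits buckets commit subjects by their conventional prefix.
// Subjects without a recognisable prefix go into "other".
func groupCommits(subjects []string) map[string][]string {
	groups := make(map[string][]string)
	for _, s := range subjects {
		m := conventional.FindStringSubmatch(s)
		if m == nil {
			groups["other"] = append(groups["other"], s)
			continue
		}
		kind := strings.ToLower(m[1])
		groups[kind] = append(groups[kind], m[3])
	}
	return groups
}

// render formats the groups as markdown, visiting kinds in the given order.
func render(version string, groups map[string][]string, order []string) string {
	var b strings.Builder
	fmt.Fprintf(&b, "## %s\n", version)
	for _, kind := range order {
		if len(groups[kind]) == 0 {
			continue
		}
		fmt.Fprintf(&b, "\n### %s\n", kind)
		for _, msg := range groups[kind] {
			fmt.Fprintf(&b, "- %s\n", msg)
		}
	}
	return b.String()
}

func main() {
	groups := groupCommits([]string{
		"feat: add retry logic",
		"fix(ssh): close idle connections",
		"update readme",
	})
	fmt.Print(render("v1.0.0", groups, []string{"feat", "fix", "other"}))
}
```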
### Release Orchestration
`Publish(ctx, cfg, dryRun)`:
1. Resolves version.
2. Scans `dist/` for pre-built artifacts (built by `core build`).
3. Generates changelog.
4. Iterates configured publishers, calling `Publisher.Publish()` on each.
5. Returns a `*Release` struct with version, artifacts, and changelog.
The separation of `core build` and `core release` allows CI pipelines to build once and publish to multiple targets independently.
### Publishers
All publishers implement `Publisher.Publish(ctx, release, pubCfg, relCfg, dryRun)`. When `dryRun` is true, publishers log what they would do without making external calls.
| Publisher | Distribution method |
|-----------|-------------------|
| `github.go` | GitHub Releases API — creates release, uploads artifact files |
| `docker.go` | `docker buildx build --push` to configured registry |
| `homebrew.go` | Generates a Ruby formula file, commits to a tap repository |
| `npm.go` | `npm publish` to the npm registry |
| `aur.go` | Generates `PKGBUILD` + `.SRCINFO`, pushes to AUR git remote |
| `scoop.go` | Generates a JSON manifest, commits to a Scoop bucket |
| `chocolatey.go` | Generates `.nuspec`, calls `choco push` |
| `linuxkit.go` | Builds and uploads LinuxKit multi-format VM images |
---
## Container and VM Management
### Overview
The `container/` package manages LinuxKit-based VM images. It abstracts hypervisor differences (QEMU on Linux, Hyperkit on macOS) and persists container state to `~/.core/state.json`.
### Hypervisor Abstraction
`hypervisor.go` auto-selects the backend at runtime:
- `Available()` checks for the hypervisor binary in `PATH`.
- Linux: QEMU (`qemu-system-x86_64` / `qemu-system-aarch64`).
- macOS: Hyperkit (`hyperkit`).
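A minimal sketch of this selection logic is shown below. The function names are hypothetical, and the mapping from `GOARCH` to a specific QEMU binary is an assumption; the text above lists both binaries but does not state how one is chosen.

```go
package main

import (
	"fmt"
	"os/exec"
	"runtime"
)

// hypervisorBinary returns the binary to look for on a given OS/architecture pair.
// The arm64 -> qemu-system-aarch64 mapping is assumed for illustration.
func hypervisorBinary(goos, goarch string) (string, error) {
	switch goos {
	case "linux":
		if goarch == "arm64" {
			return "qemu-system-aarch64", nil
		}
		return "qemu-system-x86_64", nil
	case "darwin":
		return "hyperkit", nil
	default:
		return "", fmt.Errorf("no hypervisor backend for %s", goos)
	}
}

// available reports whether the binary can be found in PATH.
func available(bin string) bool {
	_, err := exec.LookPath(bin)
	return err == nil
}

func main() {
	bin, err := hypervisorBinary(runtime.GOOS, runtime.GOARCH)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%s available: %v\n", bin, available(bin))
}
```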
### State Persistence
`state.go` serialises container records to `~/.core/state.json`. Each record includes the container name, LinuxKit image path, hypervisor PID, and network configuration.
### Template Rendering
`templates.go` renders Packer and LinuxKit YAML templates using Go `text/template`, substituting image name, kernel version, architecture, and port mappings.
---
## DevKit — Developer Toolkit
### Overview
The `devkit/` package provides code quality, security, and metrics functions exposed as a `Toolkit` struct. All methods operate on a working directory passed to `New(dir)`.
### Code Quality
| Function | Description |
|----------|-------------|
| `FindTODOs(dir)` | Uses `git grep` to locate `TODO`, `FIXME`, `HACK` comments |
| `Lint(pkg)` | Runs `go vet` and parses findings |
| `TestCount(pkg)` | Lists test functions via `go test -list` |
| `Coverage(pkg)` | Runs `go test -cover` and parses per-package percentages |
| `RaceDetect(pkg)` | Runs `go test -race` and extracts `DATA RACE` reports |
| `Build(targets...)` | Compiles targets, returns `BuildResult` with any errors |
| `ModTidy()` | Runs `go mod tidy` |
### Security
| Function | Description |
|----------|-------------|
| `AuditDeps()` | Runs `govulncheck ./...` (human-readable), parses `Vulnerability #` blocks |
| `VulnCheck(modulePath)` | Runs `govulncheck -json`, parses newline-delimited JSON into `VulnFinding` structs |
| `ScanSecrets(dir)` | Runs `gitleaks detect --report-format csv`, parses CSV output |
| `CheckPerms(dir)` | Walks directory tree, flags world-writable files |
### Vulnerability Scanning Detail
`VulnCheck` produces structured `VulnFinding` values:
```go
type VulnFinding struct {
	ID             string   // GO-2024-xxxx
	Aliases        []string // CVE/GHSA identifiers
	Package        string   // Affected package path
	CalledFunction string   // Function in call stack
	Description    string   // OSV summary
	FixedVersion   string   // Minimum fixed version
	ModulePath     string   // Go module path
}
```
`ParseVulnCheckJSON` correlates `finding` messages with `osv` metadata messages from govulncheck's JSON stream. It skips malformed lines gracefully (govulncheck occasionally emits non-JSON progress lines).
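The correlation step might look roughly like the sketch below. It is not the `devkit` implementation: the `message` struct covers only a few fields, the JSON field names (`osv.id`, `osv.summary`, `finding.osv`, `finding.fixed_version`) reflect an assumed reading of govulncheck's output, and the vulnerability ID in the sample is invented.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// message is a reduced, assumed view of one object in the govulncheck stream.
type message struct {
	OSV *struct {
		ID      string `json:"id"`
		Summary string `json:"summary"`
	} `json:"osv"`
	Finding *struct {
		OSV          string `json:"osv"`
		FixedVersion string `json:"fixed_version"`
	} `json:"finding"`
}

// finding is a trimmed-down, hypothetical counterpart to VulnFinding.
type finding struct {
	ID, Description, FixedVersion string
}

// parseFindings reads one JSON object per line, skipping lines that fail to decode.
func parseFindings(stream string) []finding {
	summaries := make(map[string]string)
	var out []finding
	sc := bufio.NewScanner(strings.NewReader(stream))
	for sc.Scan() {
		var m message
		if err := json.Unmarshal(sc.Bytes(), &m); err != nil {
			continue // malformed or non-JSON progress line
		}
		switch {
		case m.OSV != nil:
			summaries[m.OSV.ID] = m.OSV.Summary
		case m.Finding != nil:
			out = append(out, finding{ID: m.Finding.OSV, FixedVersion: m.Finding.FixedVersion})
		}
	}
	// Attach OSV metadata after the scan so message order does not matter.
	for i := range out {
		out[i].Description = summaries[out[i].ID]
	}
	return out
}

func main() {
	// GO-2024-0001 is a made-up identifier used only for this example.
	stream := `{"osv":{"id":"GO-2024-0001","summary":"example flaw"}}
scanning packages...
{"finding":{"osv":"GO-2024-0001","fixed_version":"v1.2.3"}}`
	for _, f := range parseFindings(stream) {
		fmt.Printf("%s (%s), fixed in %s\n", f.ID, f.Description, f.FixedVersion)
	}
}
```

Deferring the metadata join until after the scan means a `finding` that appears before its `osv` entry is still described correctly; a finding with no matching entry simply keeps an empty description.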
### Cyclomatic Complexity Analysis
`AnalyseComplexity(cfg ComplexityConfig)` walks Go source files using `go/ast` without external tools:
- Default threshold: 15.
- Skips `vendor/`, hidden directories, and `_test.go` files.
- Counts branching constructs: `if`, `for`, `range`, `case` (non-default), `select` comm clause, `&&`, `||`, type switch, select statement.
- Returns `[]ComplexityResult` with function name, package, file, line, and score.
`AnalyseComplexitySource(src, filename, threshold)` accepts source as a string for in-memory analysis.
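The counting approach can be sketched with `go/parser` and `ast.Inspect`. This is a simplified illustration under stated assumptions: each function starts at a score of 1, only a subset of the constructs listed above is counted (the type-switch and `select` statement increments are omitted), and none of the directory-skipping rules are implemented. `complexity` is a hypothetical helper, not the `devkit` API.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// complexity returns a score per top-level function name:
// 1 plus one for each counted branching construct in the body.
// Methods that share a name overwrite each other in this simple map.
func complexity(src string) (map[string]int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "src.go", src, 0)
	if err != nil {
		return nil, err
	}
	scores := make(map[string]int)
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Body == nil {
			continue
		}
		score := 1
		ast.Inspect(fn.Body, func(n ast.Node) bool {
			switch x := n.(type) {
			case *ast.IfStmt, *ast.ForStmt, *ast.RangeStmt:
				score++
			case *ast.CaseClause:
				if x.List != nil { // a nil List marks the default clause
					score++
				}
			case *ast.CommClause:
				if x.Comm != nil { // a nil Comm marks the default clause
					score++
				}
			case *ast.BinaryExpr:
				if x.Op == token.LAND || x.Op == token.LOR {
					score++
				}
			}
			return true
		})
		scores[fn.Name.Name] = score
	}
	return scores, nil
}

func main() {
	src := `package foo

func Abs(x int) int {
	if x > 0 && x < 100 {
		return x
	}
	return -x
}
`
	scores, err := complexity(src)
	if err != nil {
		panic(err)
	}
	fmt.Println(scores["Abs"]) // 3: base 1, one if, one &&
}
```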
### Coverage Trending
Three complementary functions handle coverage over time:
- `ParseCoverProfile(data)` — parses `go test -coverprofile` format; computes per-package statement ratios.
- `ParseCoverOutput(output)` — parses human-readable `go test -cover ./...` output.
- `CompareCoverage(previous, current)` — diffs two `CoverageSnapshot` values, returning regressions, improvements, new packages, and removed packages.
`CoverageStore` persists snapshots as a JSON array at a configurable file path, with `Append`, `Load`, and `Latest` methods.
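A comparison along the lines of `CompareCoverage` could be sketched as below. The `snapshot` and `comparison` types are hypothetical simplifications (a plain map from package path to percentage); the real `CoverageSnapshot` carries more structure.

```go
package main

import (
	"fmt"
	"sort"
)

// snapshot maps a package path to its coverage percentage (simplified).
type snapshot map[string]float64

// comparison lists package paths by how their coverage changed.
type comparison struct {
	Regressions, Improvements, NewPackages, Removed []string
}

// compare classifies every package that appears in either snapshot.
func compare(prev, curr snapshot) comparison {
	var c comparison
	for pkg, now := range curr {
		before, ok := prev[pkg]
		switch {
		case !ok:
			c.NewPackages = append(c.NewPackages, pkg)
		case now < before:
			c.Regressions = append(c.Regressions, pkg)
		case now > before:
			c.Improvements = append(c.Improvements, pkg)
		}
	}
	for pkg := range prev {
		if _, ok := curr[pkg]; !ok {
			c.Removed = append(c.Removed, pkg)
		}
	}
	// Map iteration order is random, so sort for stable output.
	for _, s := range [][]string{c.Regressions, c.Improvements, c.NewPackages, c.Removed} {
		sort.Strings(s)
	}
	return c
}

func main() {
	prev := snapshot{"ansible": 71.5, "infra": 80.0, "legacy": 12.0}
	curr := snapshot{"ansible": 68.0, "infra": 84.2, "devkit": 55.0}
	fmt.Printf("%+v\n", compare(prev, curr))
}
```

In a CI context, a non-empty `Regressions` slice would be the natural signal for failing a build.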
### Git and Metrics
| Function | Description |
|----------|-------------|
| `DiffStat()` | Parses `git diff --stat` summary |
| `UncommittedFiles()` | Lists files with uncommitted changes via `git status --porcelain` |
| `GitLog(n)` | Returns last `n` commits as structured `Commit` values |
| `DepGraph(pkg)` | Parses `go mod graph` into a `Graph{Nodes, Edges}` |
| `Complexity(threshold)` | Wraps external `gocyclo` tool (distinct from `AnalyseComplexity` which uses `go/ast`) |
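As an illustration of the `DepGraph` row, `go mod graph` prints one `from to` pair per line, which maps directly onto a node list and an adjacency map. The `graph` type and `parseModGraph` helper below are assumptions for this sketch; the real `Graph{Nodes, Edges}` field types are not specified here.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// graph is a hypothetical shape for the parsed dependency graph.
type graph struct {
	Nodes []string            // modules in first-seen order
	Edges map[string][]string // module -> direct requirements
}

// parseModGraph parses "from to" lines as printed by `go mod graph`.
// Lines that do not contain exactly two fields are ignored.
func parseModGraph(output string) graph {
	g := graph{Edges: make(map[string][]string)}
	seen := make(map[string]bool)
	addNode := func(n string) {
		if !seen[n] {
			seen[n] = true
			g.Nodes = append(g.Nodes, n)
		}
	}
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 2 {
			continue
		}
		addNode(fields[0])
		addNode(fields[1])
		g.Edges[fields[0]] = append(g.Edges[fields[0]], fields[1])
	}
	return g
}

func main() {
	out := `example.com/app github.com/stretchr/testify@v1.9.0
example.com/app gopkg.in/yaml.v3@v3.0.1
github.com/stretchr/testify@v1.9.0 gopkg.in/yaml.v3@v3.0.1`
	g := parseModGraph(out)
	fmt.Println(len(g.Nodes), "modules;", len(g.Edges["example.com/app"]), "direct requirements")
}
```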
---
## SDK Generation
### Overview
The `sdk/` package auto-detects an OpenAPI specification, generates typed client libraries in up to four languages, and detects breaking changes between spec versions using oasdiff.
### Spec Detection
`DetectSpec(dir)` checks locations in priority order:
1. Path configured in `.core/release.yaml`.
2. Common paths: `openapi.yaml`, `openapi.json`, `api/openapi.yaml`, `docs/openapi.yaml`, and four others.
### Language Generators
Each generator implements the `Generator` interface. Supported languages: TypeScript, Python, Go, PHP. Generator registration uses a string-keyed registry allowing overrides.
### Breaking Change Detection
`DetectBreakingChanges(baseSpec, revisionSpec)` uses `github.com/oasdiff/oasdiff` to compare two spec files. Returns a `DiffResult` with a human-readable summary and a list of individual breaking changes. Exit codes: 0 = no changes, 1 = non-breaking changes, 2 = breaking changes.
---
## Configuration Files
| File | Location | Purpose |
|------|----------|---------|
| `.core/build.yaml` | Project root | Build targets, ldflags, signing, archive format |
| `.core/release.yaml` | Project root | Version source, changelog style, SDK languages, publisher configs |
| `infra.yaml` | Project root | Host inventory, DNS zones, cloud provider credentials |
| `~/.core/config.yaml` | User home | Local dev environment configuration |
| `~/.core/state.json` | User home | Container/VM runtime state |
---
## External Dependencies
| Package | Purpose |
|---------|---------|
| `github.com/Snider/Borg` | xz compression for build archives |
| `github.com/getkin/kin-openapi` | OpenAPI 3.x spec parsing |
| `github.com/oasdiff/oasdiff` | API breaking change detection |
| `github.com/kluctl/go-embed-python` | Embedded Python 3.13 runtime for Coolify client |
| `github.com/spf13/cobra` | CLI framework for `build/buildcmd/` |
| `golang.org/x/crypto` | SSH connections in `ansible/ssh.go` |
| `gopkg.in/yaml.v3` | Playbook and config YAML parsing |
| `github.com/stretchr/testify` | Test assertions |
## Dependency on `forge.lthn.ai/core/go`
The parent framework is referenced via a `replace` directive in `go.mod`:
```
replace forge.lthn.ai/core/go => ../core
```
Provides: `core.E` (contextual errors), `io.Medium` (file system abstraction), config, logging, and i18n utilities.

304
docs/development.md Normal file

@ -0,0 +1,304 @@
# Development Guide — go-devops
## Prerequisites
| Tool | Minimum version | Purpose |
|------|----------------|---------|
| Go | 1.25 | Build and test |
| Task | any | Taskfile automation (optional, used by some builders) |
| `govulncheck` | latest | Vulnerability scanning (`devkit.VulnCheck`) |
| `gitleaks` | any | Secret scanning (`devkit.ScanSecrets`) |
| `gocyclo` | any | External complexity tool (`devkit.Complexity`) |
| SSH access | — | Integration tests for `ansible/` package |
Install optional tools:
```bash
go install golang.org/x/vuln/cmd/govulncheck@latest
go install github.com/zricethezav/gitleaks/v8@latest
go install github.com/fzipp/gocyclo/cmd/gocyclo@latest
```
## Local Dependency
`go-devops` depends on `forge.lthn.ai/core/go` (the parent framework). The `go.mod` `replace` directive resolves this locally:
```
replace forge.lthn.ai/core/go => ../core
```
The `../core` path must exist relative to the `go-devops` checkout. If working in a Go workspace (`go.work`), add both modules:
```
go work init
go work use . ../core
```
Do not alter the `replace` directive path.
## Build and Test
```bash
# Run all tests
go test ./...
# Run all tests with race detector
go test -race ./...
# Run a single test by name
go test -v -run TestName ./...
# Run tests in one package
go test ./ansible/...
# Static analysis
go vet ./...
# Check for vulnerabilities
govulncheck ./...
# View test coverage
go test -cover ./...
# Generate a coverage profile
go test -coverprofile=cover.out ./...
go tool cover -html=cover.out
```
## Test Patterns
### Naming Convention
Tests use `_Good`, `_Bad`, and `_Ugly` suffixes:
| Suffix | Meaning |
|--------|---------|
| `_Good` | Happy-path test; expected success |
| `_Bad` | Expected error condition; error must be returned |
| `_Ugly` | Panic, edge case, or degenerate input |
Example:
```go
func TestParsePlaybook_Good(t *testing.T) { ... }
func TestParsePlaybook_Bad(t *testing.T) { ... }
func TestParsePlaybook_Ugly(t *testing.T) { ... }
```
### Assertion Library
Use `github.com/stretchr/testify`. Prefer `require` over `assert` when subsequent assertions depend on the previous one passing:
```go
import (
	"testing"

	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

func TestSomething_Good(t *testing.T) {
	result, err := SomeFunction()
	require.NoError(t, err)
	assert.Equal(t, "expected", result.Field)
}
```
### HTTP Test Servers
Use `net/http/httptest` for API client tests. The `infra/` tests demonstrate the pattern:
```go
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(http.StatusOK)
	_, _ = w.Write([]byte(`{"id": 1}`))
}))
defer srv.Close()

client := NewHCloudClient("token", WithHTTPClient(srv.Client()))
```
### SSH Mocking
The `ansible/` package uses an `sshRunner` interface to decouple module implementations from real SSH connections. `mock_ssh_test.go` provides `MockSSHClient` with:
- `expectCommand(pattern, stdout, stderr, rc)` — registers expected command patterns.
- `hasExecuted(pattern)` — asserts a command matching the pattern was called.
- `hasExecutedMethod(method)` — asserts a specific method (`Run`, `RunScript`, `Upload`) was called.
- In-memory filesystem simulation for file operation tests.
Use `MockSSHClient` for all `ansible/modules.go` tests. Real SSH connections are not used in unit tests.
### In-Memory Complexity Analysis
For `devkit` complexity tests, use `AnalyseComplexitySource` rather than writing temporary files:
```go
src := `package foo

func Complex(x int) int {
	if x > 0 { return x }
	return -x
}`
results, err := AnalyseComplexitySource(src, "foo.go", 1)
require.NoError(t, err)
```
### Coverage Store Tests
Use `t.TempDir()` to create temporary directories for `CoverageStore` persistence tests:
```go
dir := t.TempDir()
store := NewCoverageStore(filepath.Join(dir, "coverage.json"))
```
### Publisher Dry-Run Tests
All `release/publishers/` tests use `dryRun: true`. No external services are called. Tests verify:
- Correct command-line argument construction.
- Correct file generation (formula text, manifest JSON, PKGBUILD content).
- Interface compliance: the publisher's `Name()` is non-empty and `Publish` with a nil config does not panic.
---
## Coding Standards
### Language
Use **UK English** in all documentation, comments, identifiers, log messages, and error strings:
- colour (not color)
- organisation (not organization)
- centre (not center)
- behaviour (not behavior)
- licence (noun, not license)
### Strict Types
Prefer concrete types throughout. Avoid `any` at API boundaries where a concrete type is knowable. `map[string]any` is acceptable for Ansible task arguments and YAML-decoded data where the schema is dynamic.
### Error Handling
Use the `core.E` helper from `forge.lthn.ai/core/go` for contextual errors:
```go
return core.E("ansible.Executor.runTask", "failed to upload file", err)
```
For packages that do not import `core/go`, use `fmt.Errorf` with `%w`:
```go
return fmt.Errorf("infra.HCloudClient.ListServers: %w", err)
```
Error strings must not be capitalised and must not end with punctuation (Go convention).
### Import Order
Three groups, each separated by a blank line:
1. Standard library
2. `forge.lthn.ai/core/...` packages
3. Third-party packages
```go
import (
	"context"
	"fmt"

	"forge.lthn.ai/core/go/pkg/io"

	"golang.org/x/crypto/ssh"
	"gopkg.in/yaml.v3"
)
```
### File Headers
Source files do not require a licence header comment beyond the package declaration. The `devkit/` package uses a trailing `// LEK-1 | lthn.ai | EUPL-1.2` comment; maintain this convention in `devkit/` files only.
### Interface Placement
Define interfaces in the package that consumes them, not the package that implements them. The `Builder`, `Publisher`, `Signer`, `Generator`, `Hypervisor`, and `ImageSource` interfaces each live in the package that calls them.
---
## Conventional Commits
All commits follow the Conventional Commits specification.
**Format**: `type(scope): description`
**Scopes** map to package names:
| Scope | Package |
|-------|---------|
| `ansible` | `ansible/` |
| `build` | `build/`, `build/builders/`, `build/signing/` |
| `container` | `container/` |
| `devkit` | `devkit/` |
| `devops` | `devops/` |
| `infra` | `infra/` |
| `release` | `release/`, `release/publishers/` |
| `sdk` | `sdk/`, `sdk/generators/` |
| `deploy` | `deploy/` |
**Examples**:
```
feat(ansible): add docker_compose module support
fix(infra): handle nil Retry-After header in rate limiter
refactor(build): extract archive creation into separate function
test(devkit): expand coverage trending snapshot comparison tests
chore: update go.sum after dependency upgrade
```
**Co-author line**: every commit must include:
```
Co-Authored-By: Virgil <virgil@lethean.io>
```
---
## Licence
All source files are licensed under the **European Union Public Licence 1.2 (EUPL-1.2)**. Do not introduce dependencies with licences incompatible with EUPL-1.2. The `github.com/kluctl/go-embed-python` dependency (Apache 2.0) and `golang.org/x/crypto` (BSD-3-Clause) are compatible. Verify new dependencies before adding them.
---
## Forge Repository
- **Remote**: `ssh://git@forge.lthn.ai:2223/core/go-devops.git`
- **Push**: `git push forge main`
- HTTPS authentication is not supported on the Forge instance; SSH is required.
---
## Adding a New Module to ansible/
1. Add the module name(s) to `KnownModules` in `types.go`.
2. Implement a function `executeModuleName(ctx, ssh, args, vars) TaskResult` in `modules.go`.
3. Add a `case "modulename":` branch in the dispatch switch in `executor.go`.
4. Add a module shim to `mock_ssh_test.go` that invokes the new module through the `sshRunner` interface (use `sshFileRunner` instead if the module requires file operations).
5. Write tests in `modules_*_test.go` using the mock infrastructure. Cover at minimum: success case, changed vs. unchanged, argument validation failure, and SSH error propagation.
## Adding a New Release Publisher
1. Create `release/publishers/myplatform.go`.
2. Implement `Publisher`:
- `Name() string` — return the platform name.
- `Publish(ctx, release, pubCfg, relCfg, dryRun) error` — when `dryRun` is true, log intent and return nil.
3. Register the publisher in `release/config.go` alongside existing publishers.
4. Write `release/publishers/myplatform_test.go` with dry-run tests. Follow the pattern of existing publisher tests: verify command arguments, generated file content, and interface compliance.
## Adding a New Builder
1. Create `build/builders/mylang.go`.
2. Implement `Builder`:
- `Name() string`
- `Detect(fs io.Medium, dir string) (bool, error)` — check for a marker file.
- `Build(ctx, cfg, targets) ([]Artifact, error)`
3. Register the builder in `build/buildcmd/`.
4. Write tests verifying `Detect` (marker present/absent) and `Build` (at minimum with a mock `io.Medium`).

203
docs/history.md Normal file

@ -0,0 +1,203 @@
# Project History — go-devops
## Origin
`go-devops` was extracted from the `forge.lthn.ai/core/go` monorepo on 16 February 2026. The entire codebase arrived in a single extraction commit and was pushed to its own Forge repository (`core/go-devops`). As a result, `git blame` and `git bisect` cannot distinguish the history of individual components before the extraction date; the commits that introduced any pre-extraction bug are not part of this repository's revision graph.
**Extraction commit**: the repository's first commit (`feat: extract`) contains the full initial codebase — approximately 29,000 lines across 71 source files and 47 test files spanning 16 packages.
---
## Phase 0: Test Coverage and Hardening
**Commits**: `6e346cb`, `5d22ed9`
**Scope**: Established a baseline test suite across the packages with the most critical coverage gaps at extraction.
### Completed work
- **ansible/ tests** — Added `parser_test.go` (17 tests covering `ParsePlaybook`, `ParseInventory`, `ParseTasks`, `GetHosts`, `GetHostVars`, `isModule`, `NormalizeModule`), `types_test.go` (covering `RoleRef`/`Task` `UnmarshalYAML`, `Inventory`, `Facts`, `TaskResult`, `KnownModules`), and `executor_test.go` (covering `getHosts`, `matchesTags`, `evaluateWhen`, `templateString`, `applyFilter`, `resolveLoop`, `templateArgs`, `handleNotify`, `normalizeConditions`, and helper functions).
- **infra/ tests** — Added `hetzner_test.go` (covering `HCloudClient`/`HRobotClient` construction, `do()` round-trip via `httptest`, API error handling, and JSON serialisation for `HCloudServer`, `HCloudLoadBalancer`, `HRobotServer`) and `cloudns_test.go` (covering `doRaw()` round-trip, zone/record JSON, CRUD responses, ACME challenge, auth parameters, and errors).
- **build/ tests** — Added `archive_test.go` (249 LOC, archive round-trip for tar.gz and zip, multi-file archives) and extended `signing_test.go` (+181 LOC with mock signer tests, path verification, and error handling).
- **release/ nil guard** — Fixed a nil pointer crash in `release/publishers/linuxkit.go` line 50. Added a `release.FS == nil` guard. Added a corresponding nil FS test case to `linuxkit_test.go` (+23 LOC). Total test count across build/ and release/ reached 862.
- **Race detector**: `go test -race ./...` confirmed clean across `ansible/`, `infra/`, `container/`, `devops/`, and `build/` packages.
- **`go vet ./...`** — Fixed stale API calls in `container/linuxkit_test.go`, `state_test.go`, `templates_test.go`, and `devops/devops_test.go`. Fixed the `go.mod` `replace` directive.
---
## Phase 1: Ansible Engine Hardening
**Commits**: `3330e55`, `c7da9ad`, `9638e77`, `427929f`, `8ab8643`
**Scope**: Brought the Ansible engine from zero test coverage to comprehensive coverage across all module categories, SSH infrastructure, and executor logic.
### Step 1.0: SSH mock infrastructure (`3330e55`)
Created `ansible/mock_ssh_test.go` providing:
- `MockSSHClient` with a command registry (`expectCommand`), in-memory filesystem, become-state tracking, and execution/upload logs.
- Assertion helpers: `hasExecuted`, `hasExecutedMethod`, `findExecuted`.
- Module shims via the `sshRunner` interface to decouple module functions from real SSH connections.
- 12 mock infrastructure tests confirming the mock behaves correctly in isolation.
### Step 1.1: Command execution modules (`3330e55`)
36 module tests covering `command`, `shell`, `raw`, and `script`. Verified: `command` uses `Run()`, `shell` uses `RunScript()`, `raw` passes through unmodified, `script` reads a local file before uploading. Cross-module differentiation and dispatch routing tests included. Total ansible tests at this point: 48.
### Step 1.2: File operation modules (`c7da9ad`)
54 new tests across `copy` (8), `file` (12), `lineinfile` (8), `blockinfile` (7), `stat` (5), `template` (6), dispatch (6), and integration (2). Extended the mock with an `sshFileRunner` interface and 6 module shims. Fixed an unsupported-module test (copy to hostname). Total ansible tests: 208.
### Step 1.3: Service and package modules (`9638e77`)
56 new tests across `service` (12), `systemd` (4), `apt` (9), `apt_key` (6), `apt_repository` (8), `package` (3), `pip` (8), and dispatch (7). 7 new module shims added to `mock_ssh_test.go`.
### Step 1.4: User, group, and advanced modules (`427929f`)
69 new tests across `user` (7), `group` (7), `cron` (5), `authorized_key` (7), `git` (8), `unarchive` (8), `uri` (6), `ufw` (8), `docker_compose` (7), and dispatch (6). 9 module shims. Total ansible tests: 334.
### Step 1.5: Error propagation, become, facts, idempotency (`8ab8643`)
- **Error propagation** — 68 tests across `getHosts`, `matchesTags`, `evaluateWhen`/`evalCondition`, `templateString`, `applyFilter`, `resolveLoop`, `handleNotify`, `normalizeConditions`, and cross-cutting scenarios.
- **Become/sudo** — 8 tests: enable/disable cycle, default user, passwordless sudo, play-level become.
- **Fact gathering** — 9 tests: Ubuntu, CentOS, Alpine, and Debian `os-release` parsing, hostname, and localhost behaviour.
- **Idempotency checks** — 8 tests: group exists, authorised key present, Docker Compose up-to-date, stat always reports unchanged.
- Total ansible tests at phase completion: 438.
---
## Phase 2: Infrastructure API Robustness
**Commit**: included in Phase 2 work
**Scope**: Consolidated three separate API clients behind a shared `APIClient` abstraction and added retry and rate-limit handling.
### Completed work
- **API client abstraction** — Extracted shared `APIClient` struct in `infra/client.go`. `HCloudClient`, `HRobotClient`, and `CloudNSClient` now delegate to `APIClient` via configurable auth functions and error prefixes. Options pattern: `WithHTTPClient`, `WithRetry`, `WithAuth`, `WithPrefix`. Added 30 `client_test.go` tests.
- **Retry logic**: `APIClient` implements exponential backoff with jitter. Retries on 5xx responses and transport errors. Does not retry 4xx errors (except 429). `RetryConfig` carries `MaxRetries` (default 3), `InitialBackoff` (100 ms), and `MaxBackoff` (5 s). Context cancellation is respected during backoff sleeps.
- **Rate limiting** — Detects HTTP 429 responses, parses `Retry-After` header (seconds format; falls back to 1 s). Sets a per-`APIClient` `blockedUntil` timestamp guarded by a mutex. All subsequent requests on the instance wait until the window expires. Tests include real 1 s `Retry-After` delays.
- **DigitalOcean references removed** — Investigation confirmed no DigitalOcean types or implementation existed in the codebase. Only stale documentation references were present. Removed from CLAUDE.md and FINDINGS.md. No code changes were required.
---
## Phase 3: Release Pipeline Testing
**Commit**: `032d862`
**Scope**: Complete test coverage for the release pipeline: all eight publishers, SDK orchestration, and breaking change detection.
### Completed work
- **Publisher integration tests** (`integration_test.go`, 48 tests):
- GitHub: dry-run, command-argument building, repository detection, artifact upload.
- Docker: dry-run, buildx argument construction, config parsing.
- Homebrew: dry-run, formula generation, Ruby class naming.
- Scoop: dry-run, manifest JSON generation.
- AUR: dry-run, `PKGBUILD` and `.SRCINFO` generation.
- Chocolatey: dry-run, `.nuspec` generation.
- npm: dry-run, `package.json` generation.
- LinuxKit: dry-run, multi-format and multi-platform.
- Cross-publisher: name uniqueness, nil `relCfg` safety, checksum field mapping, interface compliance.
- **SDK generation tests** (`generation_test.go`, 38 tests):
- SDK orchestration: `Generate` iterates languages, output directory creation, missing-spec error, unknown-language error.
- Generator registry: register, get, overwrite, language listing.
- Interface compliance: language identifier correctness, `Available()` safety, install instruction presence.
- SDK config: defaults, `SetVersion`, nil config safety.
- Spec detection priority: configured path takes precedence over common paths; all 8 common paths checked.
- **Breaking change detection** (`breaking_test.go`, 30 tests):
- oasdiff integration: add-endpoint (non-breaking), remove-endpoint (breaking), add-required-parameter (breaking), add-optional-parameter (non-breaking), change-response-type (breaking), remove-HTTP-method (breaking), identical specs.
- Multiple breaking changes simultaneously.
- JSON spec format support.
- Error handling: non-existent base spec, non-existent revision spec, invalid YAML.
- `DiffExitCode` values: 0 (no diff), 1 (non-breaking), 2 (breaking).
- `DiffResult` summary and human-readable changes.
---
## Phase 4: DevKit Expansion
**Commit**: `e20083d`
**Scope**: Added three new capabilities to `devkit/`: structured vulnerability scanning, native cyclomatic complexity analysis, and coverage trending with persistence.
### Completed work
- **Vulnerability scanning** (`vulncheck.go` + `vulncheck_test.go`, 13 tests):
- `VulnCheck(modulePath)` runs `govulncheck -json` and delegates to `ParseVulnCheckJSON`.
- `ParseVulnCheckJSON` processes newline-delimited JSON, correlating `finding` messages with `osv` metadata. Handles malformed lines, missing OSV entries, empty call traces.
- `VulnFinding` carries: `ID` (GO-2024-xxxx), `Aliases` (CVE/GHSA), `Package`, `CalledFunction`, `Description`, `FixedVersion`, `ModulePath`.
- **Cyclomatic complexity analysis** (`complexity.go` + `complexity_test.go`, 21 tests):
- `AnalyseComplexity(cfg)` walks Go source via `go/ast`. No external tools required.
- `AnalyseComplexitySource(src, filename, threshold)` for in-memory parsing (used in tests).
- Counts: `if`, `for`, `range`, non-default `case`, `select` comm clause, `&&`, `||`, type switch, `select` statement.
- Skips `vendor/`, hidden directories, and `_test.go` files.
- Default threshold: 15.
- **Coverage trending** (`coverage.go` + `coverage_test.go`, 19 tests):
- `ParseCoverProfile(data)` parses `go test -coverprofile` format, computing per-package statement ratios.
- `ParseCoverOutput(output)` parses human-readable `go test -cover` output.
- `CoverageStore` with JSON persistence: `Append`, `Load`, `Latest`.
- `CompareCoverage(previous, current)` diffs snapshots, returning `CoverageComparison` with `Regressions`, `Improvements`, `NewPackages`, `Removed`, and `TotalDelta`.
---
## Known Limitations
### Embedded Python runtime
`deploy/coolify/` uses an embedded Python 3.13 runtime (`github.com/kluctl/go-embed-python`) to run a Python Swagger client against the Coolify PaaS API. This adds approximately 50 MB to binary size. The design trades binary size for zero native Coolify Go client maintenance. A native Go HTTP client would eliminate this dependency but requires writing and maintaining Coolify API type mappings.
### Single-commit extraction history
All code predating 16 February 2026 arrived in a single commit. `git blame` and `git bisect` cannot identify which changes introduced bugs that existed before extraction. When investigating pre-extraction defects, examine the corresponding history in the `core/go` repository.
### Hypervisor platform specificity
`container/hypervisor.go` selects QEMU (Linux) or Hyperkit (macOS) at runtime. Neither hypervisor is available in standard CI environments. Container package tests use mock hypervisors. Real integration testing requires a machine with the hypervisor binary present.
### Ansible: no role resolution
The Ansible engine supports `include_role` and `import_role` task directives syntactically but does not implement file system role discovery (searching `roles/` directories relative to the playbook). Role tasks must be explicitly inlined or included via `include_tasks`.
### Ansible: no vault decryption
Ansible Vault-encrypted variables and files are not decrypted. Playbooks that rely on vault must decrypt values before passing them to the executor or supply plaintext variables at runtime.
### CLI via Cobra (not core/go CLI framework)
`build/buildcmd/` registers `core build` and `core release` sub-commands using `github.com/spf13/cobra` directly rather than the CLI framework from `forge.lthn.ai/core/go`. This creates a dependency divergence. Alignment with the core/go CLI framework is a future consideration.
### DigitalOcean not implemented
DigitalOcean was documented in early drafts of CLAUDE.md and FINDINGS.md but no types or implementation exist. The documentation references were removed in Phase 2. DigitalOcean support would require a new `infra/digitalocean.go` file using the `APIClient` abstraction.
---
## Future Considerations
- **Native Coolify client** — Replace `deploy/python/` and the embedded Python runtime with a native Go HTTP client for the Coolify v1 API. Eliminates the 50 MB runtime penalty and removes the `kluctl/go-embed-python` dependency.
- **Ansible role resolution** — Implement file system role discovery matching the Ansible convention (`roles/<name>/tasks/main.yml` relative to the playbook directory). Required for running the production DevOps playbooks without pre-inlining role tasks.
- **Ansible vault support** — Add vault decryption using the existing `forge.lthn.ai/core/go-crypt` package (which already manages SSH keys). Allow vault password to be supplied via environment variable or file path.
- **SSH alignment with go-crypt**: `ansible/ssh.go` uses `golang.org/x/crypto/ssh` directly. The `go-crypt` package provides key management. Aligning the two would centralise SSH key handling across the ecosystem.
- **Cobra to core/go CLI alignment** — Migrate `build/buildcmd/` from direct Cobra usage to the core/go CLI framework used by other commands. This is low risk but requires coordination with the parent CLI command tree.
- **DigitalOcean support** — Add `infra/digitalocean.go` implementing the `APIClient`-based pattern established in Phase 2. Required if Lethean infrastructure migrates workloads to DigitalOcean.
- **Coverage trending integration** — Wire `devkit.CoverageStore` into the CI pipeline to accumulate snapshots across runs and fail builds on regression. A `~/.core/coverage.json` or per-project store path would be natural.
- **Build tag isolation for hypervisor tests** — Add `//go:build linux` and `//go:build darwin` tags to `container/` tests that require platform-specific hypervisors, enabling clean CI across both platforms without mock exceptions.