From 9b55b97b282fee1614d10368102b9ce5413b68d3 Mon Sep 17 00:00:00 2001
From: Snider
Date: Fri, 20 Feb 2026 00:59:01 +0000
Subject: [PATCH] docs: add domain expert guide, task queue, and research notes

CLAUDE.md: architecture guide for ansible/build/infra/release (29K LOC)
TODO.md: 5-phase task queue (test coverage, ansible, infra, release, devkit)
FINDINGS.md: package inventory, test gaps, config ecosystem

Co-Authored-By: Virgil
---
 CLAUDE.md   | 208 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 FINDINGS.md |  78 ++++++++++++++++++++
 TODO.md     |  49 +++++++++++++
 3 files changed, 335 insertions(+)
 create mode 100644 CLAUDE.md
 create mode 100644 FINDINGS.md
 create mode 100644 TODO.md

diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 0000000..cc0bf48
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,208 @@
+# CLAUDE.md — go-devops Domain Expert Guide
+
+You are a dedicated domain expert for `forge.lthn.ai/core/go-devops`. Virgil (in core/go) orchestrates your work via TODO.md. Pick up tasks in phase order, mark `[x]` when done, commit and push.
+
+## What This Package Does
+
+Infrastructure management, build automation, and release pipelines. ~29K LOC across 118 Go files. Provides:
+
+- **Ansible engine** — Native Go playbook executor (not shelling out to `ansible-playbook`). SSH, modules, facts, handlers.
+- **Build system** — Plugin-based builders (Go, Wails, Docker, C++, LinuxKit, Taskfile). Cross-compilation, code signing (macOS/GPG/Windows).
+- **Release automation** — Version detection, changelog from git history, multi-target publishing (GitHub, Docker, Homebrew, AUR, Scoop, Chocolatey, npm).
+- **Infrastructure APIs** — Hetzner Cloud, Hetzner Robot (bare metal), CloudNS DNS, DigitalOcean.
+- **Container/VM management** — LinuxKit images on QEMU (Linux) or Hyperkit (macOS).
+- **SDK generation** — OpenAPI spec parsing, TypeScript/Python/Go/PHP client generation, breaking change detection.
+- **Developer toolkit** — Code quality metrics, TODO detection, coverage reports, dependency graphs.
+
+## Commands
+
+```bash
+go test ./...                    # Run all tests
+go test -v -run TestName ./...   # Single test
+go test -race ./...              # Race detector
+go vet ./...                     # Static analysis
+```
+
+## Local Dependencies
+
+| Module | Local Path | Notes |
+|--------|-----------|-------|
+| `forge.lthn.ai/core/go` | `../core` | Framework (core.E, io.Medium, config, i18n, log) |
+
+**Do NOT change the replace directive path.** Use go.work for local resolution if needed.
+
+## Architecture
+
+### ansible/ — Playbook Execution Engine (~3,162 LOC)
+
+| File | LOC | Purpose |
+|------|-----|---------|
+| `executor.go` | 1,021 | Playbook runner: task/handler/fact tracking, become/sudo |
+| `modules.go` | 1,434 | Module implementations: service, file, template, command, copy, apt, yum |
+| `parser.go` | 438 | YAML playbook + inventory parser |
+| `ssh.go` | 451 | SSH client connection management |
+| `types.go` | 258 | Core types: Play, Task, Handler, Inventory, Facts |
+
+Executes Ansible playbooks natively in Go. Supports: `when` conditionals, `register` variables, `notify` handlers, `become` privilege escalation, `loop` iteration, fact gathering.
+
+### build/ — Project Building & Cross-Compilation (~3,637 LOC)
+
+**Root** (797 LOC): Project type detection, archive creation (tar.gz/xz/zip via Borg compression), config from `.core/build.yaml`, SHA checksums.
+
+**builders/** (1,390 LOC): Plugin interface `Builder.Build()`.
+| Builder | LOC | Notes |
+|---------|-----|-------|
+| `go.go` | — | Go cross-compilation |
+| `wails.go` | 247 | Wails desktop app |
+| `docker.go` | 215 | Docker image build |
+| `cpp.go` | 253 | CMake C++ |
+| `linuxkit.go` | 270 | LinuxKit VM image |
+| `taskfile.go` | 275 | Taskfile automation |
+
+**signing/** (377 LOC): Signer interface. macOS `codesign`, GPG, Windows `signtool`.
+
+**buildcmd/** (1,053 LOC): CLI handlers for `core build`, `core build pwa`, `core build sdk`, `core release`.
+
+### container/ — LinuxKit VM Management (~1,208 LOC)
+
+| File | LOC | Purpose |
+|------|-----|---------|
+| `container.go` | 106 | Manager interface + Container model |
+| `linuxkit.go` | 462 | LinuxKitManager: Run, Stop, List |
+| `hypervisor.go` | 273 | Abstraction: QEMU (Linux) / Hyperkit (macOS) |
+| `state.go` | 172 | Container state persistence (`~/.core/state.json`) |
+| `templates.go` | 301 | Packer/LinuxKit template rendering |
+
+### devops/ — Portable Dev Environment (~1,216 LOC)
+
+| File | LOC | Purpose |
+|------|-----|---------|
+| `devops.go` | 243 | Manager: install, boot, stop, status |
+| `config.go` | 90 | Config from `~/.core/config.yaml` |
+| `images.go` | 198 | ImageManager: download from GitHub/CDN/registry |
+| `shell.go` | 74 | Shell execution wrapper |
+| `test.go` | 188 | Test execution helpers |
+| `serve.go` | 109 | Dev environment HTTP server |
+| `claude.go` | 143 | Claude/AI integration |
+| `ssh_utils.go` | 68 | SSH key scanning |
+
+**sources/** (218 LOC): `ImageSource` interface. GitHub Releases + S3/CDN download sources.
+
+### infra/ — Infrastructure APIs (~953 LOC)
+
+| File | LOC | Purpose |
+|------|-----|---------|
+| `config.go` | 300 | `infra.yaml` types: Host, LoadBalancer, Network, DNS, Database, Cache |
+| `hetzner.go` | 381 | Hetzner Cloud API (VPS) + Hetzner Robot API (bare metal) |
+| `cloudns.go` | 272 | CloudNS DNS: zones, records, ACME DNS-01 challenges |
+
+### release/ — Release Automation (~4,008 LOC)
+
+**Root** (1,398 LOC): Release orchestrator (version → build → changelog → publish), config from `.core/release.yaml`, git-based changelog, semver detection.
+
+**publishers/** (2,610 LOC): `Publisher` interface.
+| Publisher | LOC | Notes |
+|-----------|-----|-------|
+| `github.go` | 233 | GitHub Releases |
+| `docker.go` | 278 | Docker image build + push |
+| `homebrew.go` | 371 | Homebrew formula |
+| `npm.go` | 265 | npm registry |
+| `aur.go` | 313 | Arch Linux AUR |
+| `scoop.go` | 284 | Windows Scoop |
+| `chocolatey.go` | 294 | Windows Chocolatey |
+| `linuxkit.go` | 300 | LinuxKit image |
+
+### sdk/ — OpenAPI SDK Generation (~931 LOC)
+
+Auto-detect OpenAPI spec, generate typed clients in 4 languages, detect breaking changes via oasdiff.
+
+**generators/** (437 LOC): TypeScript, Python, Go, PHP generators.
+
+### devkit/ — Developer Toolkit (~560 LOC)
+
+Code quality analysis: TODOs/FIXMEs, coverage reports, race conditions, vulnerability detection, secret leak scanning, cyclomatic complexity, dependency graphs.
+
+### deploy/ — Deployment Integrations (~366 LOC)
+
+- **python/** — Embedded Python 3.13 runtime (kluctl/go-embed-python)
+- **coolify/** — Coolify PaaS API client via Python Swagger
+
+## Key Interfaces
+
+```go
+// build/builders/
+type Builder interface {
+	Name() string
+	Detect(fs io.Medium, dir string) (bool, error)
+	Build(ctx context.Context, cfg *Config, targets []Target) ([]Artifact, error)
+}
+
+// release/publishers/
+type Publisher interface {
+	Name() string
+	Publish(ctx context.Context, release *Release, pubCfg PublisherConfig, relCfg ReleaseConfig, dryRun bool) error
+}
+
+// container/
+type Hypervisor interface {
+	Name() string
+	Available() bool
+	Run(ctx context.Context, opts RunOptions) (*process.Handle, error)
+}
+
+// devops/sources/
+type ImageSource interface {
+	Name() string
+	Available() bool
+	Download(ctx context.Context, name, version string, progress func(downloaded, total int64)) (string, error)
+}
+
+// build/signing/
+type Signer interface {
+	Name() string
+	Available() bool
+	Sign(filePath, keyID string) ([]byte, error)
+}
+
+// sdk/generators/
+type Generator interface {
+	Language() string
+	Generate(ctx context.Context, spec, outputDir string, config *Config) error
+}
+```
+
+## External Dependencies
+
+| Package | Purpose |
+|---------|---------|
+| `github.com/Snider/Borg` | Compression (xz) for archives. **Not** Secure/Blob/Pointer. |
+| `github.com/getkin/kin-openapi` | OpenAPI 3.x spec parsing |
+| `github.com/oasdiff/oasdiff` | API breaking change detection |
+| `github.com/kluctl/go-embed-python` | Embedded Python 3.13 runtime |
+| `github.com/spf13/cobra` | CLI framework for build/release commands |
+| `golang.org/x/crypto` | SSH connections (ansible/) |
+
+## Configuration Files
+
+- `.core/build.yaml` — Build targets, ldflags, signing, archive format
+- `.core/release.yaml` — Version source, changelog style, SDK langs, publisher configs
+- `infra.yaml` — Host inventory, DNS zones, cloud provider settings
+- `~/.core/config.yaml` — Local dev environment config
+
+## Coding Standards
+
+- **UK English**: colour, organisation, centre
+- **Tests**: testify assert/require, `_Good`/`_Bad`/`_Ugly` naming convention
+- **Conventional commits**: `feat(ansible):`, `fix(infra):`, `refactor(build):`
+- **Co-Author**: `Co-Authored-By: Virgil `
+- **Licence**: EUPL-1.2
+- **Imports**: stdlib → forge.lthn.ai → third-party, each group separated by blank line
+
+## Forge
+
+- **Repo**: `forge.lthn.ai/core/go-devops`
+- **Push via SSH**: `git push forge main` (remote: `ssh://git@forge.lthn.ai:2223/core/go-devops.git`)
+
+## Task Queue
+
+See `TODO.md` for prioritised work. See `FINDINGS.md` for research notes.
diff --git a/FINDINGS.md b/FINDINGS.md
new file mode 100644
index 0000000..97ce48f
--- /dev/null
+++ b/FINDINGS.md
@@ -0,0 +1,78 @@
+# FINDINGS.md — go-devops Research & Discovery
+
+## 2026-02-20: Initial Analysis (Virgil)
+
+### Origin
+
+Extracted from `core/go` on 16 Feb 2026 (commit `392ad68`). Single extraction commit — fresh repo.
+
+### Package Inventory
+
+| Package | Files | Source LOC | Test Files | Notes |
+|---------|-------|-----------|-----------|-------|
+| `ansible/` | 5 | 3,162 | 1 | Playbook executor, SSH, modules, parser |
+| `build/` | 6 | 797 | 4 | Project detection, archives, checksums, config |
+| `build/builders/` | 6 | 1,390 | — | Go, Wails, Docker, C++, LinuxKit, Taskfile |
+| `build/signing/` | 5 | 377 | — | macOS, GPG, Windows signtool |
+| `build/buildcmd/` | 6 | 1,053 | — | CLI command handlers |
+| `container/` | 5 | 1,208 | 4 | LinuxKit VMs, hypervisor abstraction, state |
+| `deploy/python/` | 1 | 147 | — | Embedded Python 3.13 |
+| `deploy/coolify/` | 1 | 219 | — | Coolify PaaS API client |
+| `devkit/` | 1 | 560 | 1 | Code quality metrics |
+| `devops/` | 8 | 1,216 | 8 | Dev environment manager |
+| `devops/sources/` | 3 | 218 | — | GitHub/CDN image sources |
+| `infra/` | 3 | 953 | 1 | Hetzner, CloudNS, config |
+| `release/` | 5 | 1,398 | 5 | Release orchestrator |
+| `release/publishers/` | 9 | 2,610 | 9 | 8 target platforms |
+| `sdk/` | 3 | 494 | 3 | OpenAPI detection + diff |
+| `sdk/generators/` | 5 | 437 | 5 | 4-language SDK gen |
+
+**Total**: ~29K LOC across 71 source files + 47 test files
+
+### Key Observations
+
+1. **ansible/modules.go is the largest file** — 1,434 LOC implementing Ansible modules in pure Go. Zero tests. Highest-priority testing gap.
+
+2. **Borg dependency is compression-only** — `github.com/Snider/Borg` used for xz archive creation in `build/archive.go`. Does NOT use the Secure/Blob/Pointer features.
+
+3. **Python 3.13 embedded** — `deploy/python/` embeds a full Python runtime via kluctl/go-embed-python. Used exclusively for Coolify API client (Python Swagger). Consider replacing with native Go HTTP client to remove the 50MB+ Python dependency.
+
+4. **DigitalOcean gap** — Referenced in `infra/config.go` types but no `digitalocean.go` implementation exists. Either implement or remove the dead types.
+
+5. **Single-commit repo** — Entire codebase arrived in one `feat: extract` commit. No git history for individual components. This makes blame/bisect impossible for bugs originating before extraction.
+
+6. **Hypervisor platform detection** — `container/hypervisor.go` auto-selects QEMU on Linux, Hyperkit on macOS. Both are platform-specific — tests may need build tags or mocking.
+
+7. **CLI via Cobra** — `build/buildcmd/` uses Cobra directly (not core/go's CLI framework). May need alignment.
+
+8. **8 release publishers** — GitHub, Docker, Homebrew, npm, AUR, Scoop, Chocolatey, LinuxKit. All implement the `Publisher` interface. Each is ~250-370 LOC. All have test files.
+
+### Test Coverage Gaps
+
+| Package | Gap Severity | Notes |
+|---------|-------------|-------|
+| `ansible/modules.go` | **Critical** | 1,434 LOC, zero tests |
+| `ansible/executor.go` | **Critical** | 1,021 LOC, zero tests |
+| `ansible/parser.go` | High | 438 LOC, zero tests |
+| `infra/hetzner.go` | High | 381 LOC, zero tests — API calls untested |
+| `infra/cloudns.go` | High | 272 LOC, zero tests — DNS ops untested |
+| `build/builders/*` | Medium | 1,390 LOC, no individual builder tests |
+| `build/signing/*` | Medium | 377 LOC, signing logic untested |
+| `deploy/*` | Low | 366 LOC, Python/Coolify integration |
+
+### Integration Points
+
+- **core/go** → Framework (core.E, io.Medium, config, logging)
+- **core/go-crypt** → SSH key management (ansible/ssh.go uses golang.org/x/crypto directly, could use go-crypt)
+- **core/cli** → Build/release commands registered via Cobra
+- **DevOps repo** → `infra.yaml` config used by Ansible playbooks in `/Users/snider/Code/DevOps`
+
+### Config File Ecosystem
+
+| File | Location | Purpose |
+|------|----------|---------|
+| `.core/build.yaml` | Project root | Build targets, signing, archives |
+| `.core/release.yaml` | Project root | Version, changelog, publishers |
+| `infra.yaml` | Project root | Host inventory, DNS, cloud providers |
+| `~/.core/config.yaml` | User home | Local dev environment config |
+| `~/.core/state.json` | User home | Container/VM state persistence |
diff --git a/TODO.md b/TODO.md
new file mode 100644
index 0000000..3e9fe8b
--- /dev/null
+++ b/TODO.md
@@ -0,0 +1,49 @@
+# TODO.md — go-devops
+
+Dispatched from core/go orchestration. Pick up tasks in order.
+
+---
+
+## Phase 0: Test Coverage & Hardening
+
+- [ ] **Expand ansible/ tests** — Only `ssh_test.go` exists. Add: `executor_test.go` (run a minimal playbook with mock SSH, verify task order + handler notification), `modules_test.go` (test each module: service start/stop, file copy, template render, command exec — use mocked SSH session), `parser_test.go` (parse valid playbook YAML, invalid YAML, empty plays, nested vars), `types_test.go` (Facts merge, Inventory host grouping).
+- [ ] **Expand infra/ tests** — Only `config_test.go` exists. Add: `hetzner_test.go` (mock HTTP responses for server list/create/delete, load balancer ops, snapshot management), `cloudns_test.go` (mock DNS zone/record CRUD, ACME challenge create/cleanup, error responses). Use `httptest.NewServer` for API mocking.
+- [ ] **Expand build/ tests** — Add: builder detection tests (each builder's `Detect()` with matching/non-matching directory structures), archive round-trip (create tar.gz → extract → compare), signing mock tests (verify `Sign()` called with correct paths).
+- [ ] **Expand release/ tests** — Add: version detection from git tags / package.json / go.mod, changelog generation from conventional commits (mock git log output), publisher dry-run tests.
+- [ ] **Race condition tests** — `go test -race ./...` across all packages. Ansible executor runs concurrent handlers — verify thread safety.
+- [ ] **`go vet ./...` clean** — Fix any warnings.
+
+## Phase 1: Ansible Engine Hardening
+
+- [ ] **Module test coverage** — `modules.go` is 1,434 LOC with zero tests. Each module (service, file, template, command, copy, apt, yum) needs unit tests with mocked SSH sessions.
+- [ ] **Error propagation** — Verify all SSH errors are wrapped with `core.E()` including host context. Currently some errors may lose the host identifier.
+- [ ] **Fact gathering** — Test fact collection from different Linux distros (Ubuntu, CentOS, Alpine). Mock `/etc/os-release` parsing.
+- [ ] **Become/sudo** — Test privilege escalation paths. Verify password prompt handling.
+- [ ] **Idempotency checks** — Modules should report `changed: false` when no action needed. Verify for file, service, template modules.
+
+## Phase 2: Infrastructure API Robustness
+
+- [ ] **Retry logic** — Add configurable retry with exponential backoff for Hetzner Cloud/Robot and CloudNS API calls. Cloud APIs are flaky.
+- [ ] **Rate limiting** — Hetzner Cloud has rate limits. Detect 429 responses, queue and retry.
+- [ ] **DigitalOcean support** — Currently referenced in config but no implementation. Either implement or remove.
+- [ ] **API client abstraction** — Extract common HTTP client pattern from hetzner.go and cloudns.go into shared infra client.
+
+## Phase 3: Release Pipeline Testing
+
+- [ ] **Publisher integration tests** — Mock GitHub API for release creation, Docker registry for image push, Homebrew tap for formula update. Verify dry-run mode produces correct output without side effects.
+- [ ] **SDK generation tests** — Generate TypeScript/Go/Python clients from a test OpenAPI spec. Verify output compiles/type-checks.
+- [ ] **Breaking change detection** — Test oasdiff integration: modify a spec with breaking change, verify detection and failure mode.
+
+## Phase 4: DevKit Expansion
+
+- [ ] **Vulnerability scanning** — Integrate `govulncheck` output parsing into devkit findings.
+- [ ] **Complexity thresholds** — Configurable cyclomatic complexity threshold. Flag functions exceeding it.
+- [ ] **Coverage trending** — Store coverage snapshots, detect regressions between runs.
+
+---
+
+## Workflow
+
+1. Virgil in core/go writes tasks here after research
+2. This repo's dedicated session picks up tasks in phase order
+3. Mark `[x]` when done, note commit hash