Project History — go-devops
Origin
go-devops was extracted from the forge.lthn.ai/core/go monorepo on 16 February 2026. The entire codebase arrived in a single extraction commit and was pushed to its own Forge repository (core/go-devops). As a result, git blame and git bisect cannot distinguish the history of individual components before the extraction date: the commits that introduced any pre-extraction bug lie outside this repository's revision graph.
Extraction commit: the repository's first commit (feat: extract) contains the full initial codebase — approximately 29,000 lines across 71 source files and 47 test files spanning 16 packages.
Phase 0: Test Coverage and Hardening
Commits: 6e346cb, 5d22ed9
Scope: Established a baseline test suite across the packages with the most critical coverage gaps at extraction.
Completed work
- ansible/ tests: Added parser_test.go (17 tests covering ParsePlaybook, ParseInventory, ParseTasks, GetHosts, GetHostVars, isModule, NormalizeModule), types_test.go (covering RoleRef/Task UnmarshalYAML, Inventory, Facts, TaskResult, KnownModules), and executor_test.go (covering getHosts, matchesTags, evaluateWhen, templateString, applyFilter, resolveLoop, templateArgs, handleNotify, normalizeConditions, and helper functions).
- infra/ tests: Added hetzner_test.go (covering HCloudClient/HRobotClient construction, do() round-trip via httptest, API error handling, and JSON serialisation for HCloudServer, HCloudLoadBalancer, HRobotServer) and cloudns_test.go (covering doRaw() round-trip, zone/record JSON, CRUD responses, ACME challenge, auth parameters, and errors).
- build/ tests: Added archive_test.go (249 LOC, archive round-trip for tar.gz and zip, multi-file archives) and extended signing_test.go (+181 LOC with mock signer tests, path verification, and error handling).
- release/ nil guard: Fixed a nil pointer crash in release/publishers/linuxkit.go line 50. Added a release.FS == nil guard. Added a corresponding nil FS test case to linuxkit_test.go (+23 LOC). Total test count across build/ and release/ reached 862.
- Race detector: go test -race ./... confirmed clean across ansible/, infra/, container/, devops/, and build/ packages.
- go vet ./...: Fixed stale API calls in container/linuxkit_test.go, state_test.go, templates_test.go, and devops/devops_test.go. Fixed the go.mod replace directive.
Phase 1: Ansible Engine Hardening
Commits: 3330e55, c7da9ad, 9638e77, 427929f, 8ab8643
Scope: Brought the Ansible engine from zero test coverage to comprehensive coverage across all module categories, SSH infrastructure, and executor logic.
Step 1.0: SSH mock infrastructure (3330e55)
Created ansible/mock_ssh_test.go providing:
- MockSSHClient with a command registry (expectCommand), in-memory filesystem, become-state tracking, and execution/upload logs.
- Assertion helpers: hasExecuted, hasExecutedMethod, findExecuted.
- Module shims via the sshRunner interface to decouple module functions from real SSH connections.
- 12 mock infrastructure tests confirming the mock behaves correctly in isolation.
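The mock described above can be pictured with a minimal sketch. The source names sshRunner, MockSSHClient, expectCommand, and hasExecuted but not their signatures, so the method set, the field names, the regular-expression registry, and the hostnameOf helper below are all assumptions made for illustration.

```go
package main

import (
	"context"
	"fmt"
	"regexp"
)

// sshRunner is a hypothetical reduction of the interface that module
// functions depend on; the real method set is not shown in the source.
type sshRunner interface {
	Run(ctx context.Context, cmd string) (stdout string, exitCode int, err error)
}

// cannedResult is the response returned when a registered pattern matches.
type cannedResult struct {
	pattern  *regexp.Regexp
	stdout   string
	exitCode int
}

// MockSSHClient answers commands from a registry and logs every call.
type MockSSHClient struct {
	registry []cannedResult
	executed []string
}

// expectCommand registers a canned response for commands matching pattern.
func (m *MockSSHClient) expectCommand(pattern, stdout string, exitCode int) {
	m.registry = append(m.registry, cannedResult{regexp.MustCompile(pattern), stdout, exitCode})
}

// Run logs the command and returns the first matching canned response.
// Unregistered commands succeed with empty output (an assumption).
func (m *MockSSHClient) Run(_ context.Context, cmd string) (string, int, error) {
	m.executed = append(m.executed, cmd)
	for _, r := range m.registry {
		if r.pattern.MatchString(cmd) {
			return r.stdout, r.exitCode, nil
		}
	}
	return "", 0, nil
}

// hasExecuted reports whether any logged command matches pattern.
func (m *MockSSHClient) hasExecuted(pattern string) bool {
	re := regexp.MustCompile(pattern)
	for _, cmd := range m.executed {
		if re.MatchString(cmd) {
			return true
		}
	}
	return false
}

// hostnameOf stands in for a module function that only needs an sshRunner.
func hostnameOf(ctx context.Context, r sshRunner) (string, error) {
	out, code, err := r.Run(ctx, "hostname -f")
	if err != nil {
		return "", err
	}
	if code != 0 {
		return "", fmt.Errorf("hostname exited with code %d", code)
	}
	return out, nil
}

func main() {
	mock := &MockSSHClient{}
	mock.expectCommand(`^hostname`, "web1.example.test", 0)
	name, err := hostnameOf(context.Background(), mock)
	fmt.Println(name, err, mock.hasExecuted(`hostname -f`))
}
```

Because hostnameOf accepts the interface rather than a concrete client, a test can pass the mock and then assert on the execution log without opening a connection.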
Step 1.1: Command execution modules (3330e55)
36 module tests covering command, shell, raw, and script. Verified: command uses Run(), shell uses RunScript(), raw passes through unmodified, script reads a local file before uploading. Cross-module differentiation and dispatch routing tests included. Total ansible tests at this point: 48.
Step 1.2: File operation modules (c7da9ad)
54 new tests across copy (8), file (12), lineinfile (8), blockinfile (7), stat (5), template (6), dispatch (6), and integration (2). Extended the mock with an sshFileRunner interface and 6 module shims. Updated the unsupported-module test to use hostname in place of copy, which is now a supported module. Total ansible tests: 208.
Step 1.3: Service and package modules (9638e77)
56 new tests across service (12), systemd (4), apt (9), apt_key (6), apt_repository (8), package (3), pip (8), and dispatch (7). 7 new module shims added to mock_ssh_test.go.
Step 1.4: User, group, and advanced modules (427929f)
69 new tests across user (7), group (7), cron (5), authorized_key (7), git (8), unarchive (8), uri (6), ufw (8), docker_compose (7), and dispatch (6). 9 module shims. Total ansible tests: 334.
Step 1.5: Error propagation, become, facts, idempotency (8ab8643)
- Error propagation: 68 tests across getHosts, matchesTags, evaluateWhen/evalCondition, templateString, applyFilter, resolveLoop, handleNotify, normalizeConditions, and cross-cutting scenarios.
- Become/sudo: 8 tests covering the enable/disable cycle, default user, passwordless sudo, and play-level become.
- Fact gathering: 9 tests covering Ubuntu, CentOS, Alpine, and Debian os-release parsing, hostname, and localhost behaviour.
- Idempotency checks: 8 tests covering group exists, authorised key present, Docker Compose up-to-date, and stat always reporting unchanged.
- Total ansible tests at phase completion: 438.
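The os-release parsing exercised by the fact-gathering tests can be illustrated with a minimal sketch. parseOSRelease is a hypothetical name, not the engine's own function, and the sketch assumes plain KEY=value lines with optional surrounding quotes; it does not handle shell escapes.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseOSRelease (hypothetical) turns os-release content into a key/value map.
// Blank lines, comments, and lines without "=" are skipped.
func parseOSRelease(content string) map[string]string {
	facts := make(map[string]string)
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, ok := strings.Cut(line, "=")
		if !ok {
			continue
		}
		// Strip one layer of single or double quotes around the value.
		facts[key] = strings.Trim(value, `"'`)
	}
	return facts
}

func main() {
	sample := "NAME=\"Alpine Linux\"\nID=alpine\nVERSION_ID=3.19.1\n"
	facts := parseOSRelease(sample)
	fmt.Println(facts["ID"], facts["VERSION_ID"], facts["NAME"])
}
```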
Phase 2: Infrastructure API Robustness
Commit: included in Phase 2 work
Scope: Consolidated three separate API clients behind a shared APIClient abstraction and added retry and rate-limit handling.
Completed work
- API client abstraction: Extracted a shared APIClient struct in infra/client.go. HCloudClient, HRobotClient, and CloudNSClient now delegate to APIClient via configurable auth functions and error prefixes. Options pattern: WithHTTPClient, WithRetry, WithAuth, WithPrefix. Added 30 tests in client_test.go.
- Retry logic: APIClient implements exponential backoff with jitter. Retries on 5xx responses and transport errors. Does not retry 4xx errors (except 429). RetryConfig carries MaxRetries (default 3), InitialBackoff (100 ms), and MaxBackoff (5 s). Context cancellation is respected during backoff sleeps.
- Rate limiting: Detects HTTP 429 responses and parses the Retry-After header (seconds format; falls back to 1 s). Sets a per-APIClient blockedUntil timestamp guarded by a mutex. All subsequent requests on the instance wait until the window expires. Tests include real 1 s Retry-After delays.
- DigitalOcean references removed: Investigation confirmed no DigitalOcean types or implementation existed in the codebase. Only stale documentation references were present. Removed from CLAUDE.md and FINDINGS.md. No code changes were required.
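The retry and rate-limit rules above can be sketched as follows. Only RetryConfig, its three fields, their defaults, and the blockedUntil name come from the text; the helper names (shouldRetry, backoff, parseRetryAfter, rateGate), the jitter range, and the treatment of negative Retry-After values are assumptions, not the APIClient implementation.

```go
package main

import (
	"fmt"
	"math/rand"
	"net/http"
	"strconv"
	"strings"
	"sync"
	"time"
)

// RetryConfig carries the three fields named in the text.
type RetryConfig struct {
	MaxRetries     int
	InitialBackoff time.Duration
	MaxBackoff     time.Duration
}

// defaultRetryConfig returns the defaults quoted in the text.
func defaultRetryConfig() RetryConfig {
	return RetryConfig{MaxRetries: 3, InitialBackoff: 100 * time.Millisecond, MaxBackoff: 5 * time.Second}
}

// shouldRetry accepts any 5xx, plus 429 as the only retryable 4xx.
func shouldRetry(status int) bool {
	return (status >= 500 && status <= 599) || status == http.StatusTooManyRequests
}

// backoff doubles the delay per attempt (attempt 0 is the first retry),
// caps it at MaxBackoff, then picks a random point in [d/2, d].
// The jitter range is an assumption; the source only says "with jitter".
func backoff(cfg RetryConfig, attempt int) time.Duration {
	d := cfg.InitialBackoff
	for i := 0; i < attempt && d < cfg.MaxBackoff; i++ {
		d *= 2
	}
	if d > cfg.MaxBackoff {
		d = cfg.MaxBackoff
	}
	half := d / 2
	return half + time.Duration(rand.Int63n(int64(half)+1))
}

// parseRetryAfter reads a Retry-After value in seconds, falling back to 1 s
// when the header is missing, malformed, or negative.
func parseRetryAfter(header string) time.Duration {
	secs, err := strconv.Atoi(strings.TrimSpace(header))
	if err != nil || secs < 0 {
		return time.Second
	}
	return time.Duration(secs) * time.Second
}

// rateGate holds the shared blockedUntil timestamp behind a mutex.
type rateGate struct {
	mu           sync.Mutex
	blockedUntil time.Time
}

// block closes the gate for d, starting now.
func (g *rateGate) block(d time.Duration) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.blockedUntil = time.Now().Add(d)
}

// remaining returns how long a new request would still have to wait.
func (g *rateGate) remaining() time.Duration {
	g.mu.Lock()
	defer g.mu.Unlock()
	if wait := time.Until(g.blockedUntil); wait > 0 {
		return wait
	}
	return 0
}

func main() {
	cfg := defaultRetryConfig()
	for attempt := 0; attempt < cfg.MaxRetries; attempt++ {
		fmt.Printf("retry %d: wait %v\n", attempt+1, backoff(cfg, attempt))
	}
	gate := &rateGate{}
	gate.block(parseRetryAfter("2"))
	fmt.Println("retry 404:", shouldRetry(404), "retry 429:", shouldRetry(429), "gate closed:", gate.remaining() > 0)
}
```

The sketch leaves out the sleep itself. A caller honouring context cancellation would wait on a timer and on the context's Done channel in a single select, returning early if the context ends first.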
Phase 3: Release Pipeline Testing
Commit: 032d862
Scope: Complete test coverage for the release pipeline: all eight publishers, SDK orchestration, and breaking change detection.
Completed work
- Publisher integration tests (integration_test.go, 48 tests):
  - GitHub: dry-run, command-argument building, repository detection, artifact upload.
  - Docker: dry-run, buildx argument construction, config parsing.
  - Homebrew: dry-run, formula generation, Ruby class naming.
  - Scoop: dry-run, manifest JSON generation.
  - AUR: dry-run, PKGBUILD and .SRCINFO generation.
  - Chocolatey: dry-run, .nuspec generation.
  - npm: dry-run, package.json generation.
  - LinuxKit: dry-run, multi-format and multi-platform.
  - Cross-publisher: name uniqueness, nil relCfg safety, checksum field mapping, interface compliance.
- SDK generation tests (generation_test.go, 38 tests):
  - SDK orchestration: Generate iterates languages, output directory creation, missing-spec error, unknown-language error.
  - Generator registry: register, get, overwrite, language listing.
  - Interface compliance: language identifier correctness, Available() safety, install instruction presence.
  - SDK config: defaults, SetVersion, nil config safety.
  - Spec detection priority: configured path takes precedence over common paths; all 8 common paths checked.
- Breaking change detection (breaking_test.go, 30 tests):
  - oasdiff integration: add-endpoint (non-breaking), remove-endpoint (breaking), add-required-parameter (breaking), add-optional-parameter (non-breaking), change-response-type (breaking), remove-HTTP-method (breaking), identical specs.
  - Multiple breaking changes simultaneously.
  - JSON spec format support.
  - Error handling: non-existent base spec, non-existent revision spec, invalid YAML.
  - DiffExitCode values: 0 (no diff), 1 (non-breaking), 2 (breaking).
  - DiffResult summary and human-readable changes.
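The three DiffExitCode values can be read as a severity ladder. The sketch below is illustrative only: the numeric values come from the text, while the constant names and the exitCodeFor helper are assumptions.

```go
package main

import "fmt"

// DiffExitCode uses the three values listed in the text.
// The constant names are assumptions.
type DiffExitCode int

const (
	ExitNoDiff      DiffExitCode = 0 // specs are identical
	ExitNonBreaking DiffExitCode = 1 // differences, none breaking
	ExitBreaking    DiffExitCode = 2 // at least one breaking change
)

// exitCodeFor (hypothetical) derives the exit code from change counts.
// A single breaking change outranks any number of non-breaking ones.
func exitCodeFor(totalChanges, breakingChanges int) DiffExitCode {
	switch {
	case breakingChanges > 0:
		return ExitBreaking
	case totalChanges > 0:
		return ExitNonBreaking
	default:
		return ExitNoDiff
	}
}

func main() {
	// Identical specs, two additive changes, three changes with one breaking.
	fmt.Println(exitCodeFor(0, 0), exitCodeFor(2, 0), exitCodeFor(3, 1)) // prints "0 1 2"
}
```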
Phase 4: DevKit Expansion
Commit: e20083d
Scope: Added three new capabilities to devkit/: structured vulnerability scanning, native cyclomatic complexity analysis, and coverage trending with persistence.
Completed work
- Vulnerability scanning (vulncheck.go + vulncheck_test.go, 13 tests):
  - VulnCheck(modulePath) runs govulncheck -json and delegates to ParseVulnCheckJSON.
  - ParseVulnCheckJSON processes newline-delimited JSON, correlating finding messages with osv metadata. Handles malformed lines, missing OSV entries, empty call traces.
  - VulnFinding carries: ID (GO-2024-xxxx), Aliases (CVE/GHSA), Package, CalledFunction, Description, FixedVersion, ModulePath.
- Cyclomatic complexity analysis (complexity.go + complexity_test.go, 21 tests):
  - AnalyseComplexity(cfg) walks Go source via go/ast. No external tools required.
  - AnalyseComplexitySource(src, filename, threshold) for in-memory parsing (used in tests).
  - Counts: if, for, range, non-default case, select comm clause, &&, ||, type switch, select statement.
  - Skips vendor/, hidden directories, and _test.go files.
  - Default threshold: 15.
- Coverage trending (coverage.go + coverage_test.go, 19 tests):
  - ParseCoverProfile(data) parses go test -coverprofile format, computing per-package statement ratios.
  - ParseCoverOutput(output) parses human-readable go test -cover output.
  - CoverageStore with JSON persistence: Append, Load, Latest.
  - CompareCoverage(previous, current) diffs snapshots, returning CoverageComparison with Regressions, Improvements, NewPackages, Removed, and TotalDelta.
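The counting rules listed for the complexity analysis can be sketched with go/ast alone. This is not the devkit implementation: complexityOf is a hypothetical function, the base score of 1 per function is the usual convention rather than something the text states, and functions are keyed by bare name, so methods with the same name on different receivers would collide.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// complexityOf (hypothetical) returns a per-function score for one source file.
func complexityOf(src string) (map[string]int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "sketch.go", src, 0)
	if err != nil {
		return nil, err
	}
	scores := make(map[string]int)
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Body == nil {
			continue
		}
		score := 1 // a function with no branches has exactly one path
		ast.Inspect(fn.Body, func(n ast.Node) bool {
			switch v := n.(type) {
			case *ast.IfStmt, *ast.ForStmt, *ast.RangeStmt,
				*ast.TypeSwitchStmt, *ast.SelectStmt:
				score++
			case *ast.CaseClause:
				if v.List != nil { // a nil List marks the default case
					score++
				}
			case *ast.CommClause:
				if v.Comm != nil { // a nil Comm marks the default case
					score++
				}
			case *ast.BinaryExpr:
				if v.Op == token.LAND || v.Op == token.LOR {
					score++
				}
			}
			return true
		})
		scores[fn.Name.Name] = score
	}
	return scores, nil
}

func main() {
	src := `package p

func classify(a, b int) string {
	if a > 0 && b > 0 {
		return "both"
	}
	switch a {
	case 0:
		return "zero"
	default:
		return "other"
	}
}
`
	scores, err := complexityOf(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(scores)
}
```

A caller would compare each score against a threshold (15 by default, per the text) and report the functions that exceed it.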
Known Limitations
Embedded Python runtime
deploy/coolify/ uses an embedded Python 3.13 runtime (github.com/kluctl/go-embed-python) to run a Python Swagger client against the Coolify PaaS API. This adds approximately 50 MB to binary size. The design trades binary size for zero native Coolify Go client maintenance. A native Go HTTP client would eliminate this dependency but requires writing and maintaining Coolify API type mappings.
Single-commit extraction history
All code predating 16 February 2026 arrived in a single commit. git blame and git bisect cannot identify which changes introduced bugs that existed before extraction. When investigating pre-extraction defects, examine the corresponding history in the core/go repository.
Hypervisor platform specificity
container/hypervisor.go selects QEMU (Linux) or Hyperkit (macOS) at runtime. Neither hypervisor is available in standard CI environments. Container package tests use mock hypervisors. Real integration testing requires a machine with the hypervisor binary present.
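A minimal sketch of runtime selection by operating system follows. hypervisorFor and its return values are hypothetical and are not taken from container/hypervisor.go; passing the OS name as a parameter is a choice made here so that the logic can be tested on any platform.

```go
package main

import (
	"fmt"
	"runtime"
)

// hypervisorFor (hypothetical) maps an OS name to the hypervisor named in
// the text: QEMU on Linux, Hyperkit on macOS.
func hypervisorFor(goos string) (string, error) {
	switch goos {
	case "linux":
		return "qemu", nil
	case "darwin":
		return "hyperkit", nil
	default:
		return "", fmt.Errorf("no supported hypervisor for %s", goos)
	}
}

func main() {
	name, err := hypervisorFor(runtime.GOOS)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("selected hypervisor:", name)
}
```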
Ansible: no role resolution
The Ansible engine supports include_role and import_role task directives syntactically but does not implement file system role discovery (searching roles/ directories relative to the playbook). Role tasks must be explicitly inlined or included via include_tasks.
Ansible: no vault decryption
Ansible Vault-encrypted variables and files are not decrypted. Playbooks that rely on vault must decrypt values before passing them to the executor or supply plaintext variables at runtime.
CLI via Cobra (not core/go CLI framework)
build/buildcmd/ registers core build and core release sub-commands using github.com/spf13/cobra directly rather than the CLI framework from forge.lthn.ai/core/go. This creates a dependency divergence. Alignment with the core/go CLI framework is a future consideration.
DigitalOcean not implemented
DigitalOcean was documented in early drafts of CLAUDE.md and FINDINGS.md but no types or implementation exist. The documentation references were removed in Phase 2. DigitalOcean support would require a new infra/digitalocean.go file using the APIClient abstraction.
Future Considerations
- Native Coolify client: Replace deploy/python/ and the embedded Python runtime with a native Go HTTP client for the Coolify v1 API. Eliminates the 50 MB runtime penalty and removes the kluctl/go-embed-python dependency.
- Ansible role resolution: Implement file system role discovery matching the Ansible convention (roles/<name>/tasks/main.yml relative to the playbook directory). Required for running the production DevOps playbooks without pre-inlining role tasks.
- Ansible vault support: Add vault decryption using the existing forge.lthn.ai/core/go-crypt package (which already manages SSH keys). Allow vault password to be supplied via environment variable or file path.
- SSH alignment with go-crypt: ansible/ssh.go uses golang.org/x/crypto/ssh directly. The go-crypt package provides key management. Aligning the two would centralise SSH key handling across the ecosystem.
- Cobra to core/go CLI alignment: Migrate build/buildcmd/ from direct Cobra usage to the core/go CLI framework used by other commands. This is low risk but requires coordination with the parent CLI command tree.
- DigitalOcean support: Add infra/digitalocean.go implementing the APIClient-based pattern established in Phase 2. Required if Lethean infrastructure migrates workloads to DigitalOcean.
- Coverage trending integration: Wire devkit.CoverageStore into the CI pipeline to accumulate snapshots across runs and fail builds on regression. A ~/.core/coverage.json or per-project store path would be natural.
- Build tag isolation for hypervisor tests: Add //go:build linux and //go:build darwin tags to container/ tests that require platform-specific hypervisors, enabling clean CI across both platforms without mock exceptions.
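For the coverage trending item, the regression check a CI step would rely on can be sketched as below. The CoverageComparison field names come from the Phase 4 description; the field types, the CoverageSnapshot shape, the compareCoverage stand-in, and the definition of TotalDelta as a difference of unweighted per-package means are all assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

// CoverageSnapshot is a hypothetical shape: package path -> coverage percent.
type CoverageSnapshot map[string]float64

// CoverageComparison uses the field names from the text; the types are assumed.
type CoverageComparison struct {
	Regressions  []string
	Improvements []string
	NewPackages  []string
	Removed      []string
	TotalDelta   float64
}

// mean returns the unweighted average coverage, or 0 for an empty snapshot.
func mean(s CoverageSnapshot) float64 {
	if len(s) == 0 {
		return 0
	}
	var sum float64
	for _, v := range s {
		sum += v
	}
	return sum / float64(len(s))
}

// compareCoverage is a stand-in for CompareCoverage(previous, current).
func compareCoverage(previous, current CoverageSnapshot) CoverageComparison {
	var c CoverageComparison
	for pkg, cur := range current {
		prev, known := previous[pkg]
		switch {
		case !known:
			c.NewPackages = append(c.NewPackages, pkg)
		case cur < prev:
			c.Regressions = append(c.Regressions, pkg)
		case cur > prev:
			c.Improvements = append(c.Improvements, pkg)
		}
	}
	for pkg := range previous {
		if _, still := current[pkg]; !still {
			c.Removed = append(c.Removed, pkg)
		}
	}
	// Map iteration order is random, so sort for stable reports.
	sort.Strings(c.Regressions)
	sort.Strings(c.Improvements)
	sort.Strings(c.NewPackages)
	sort.Strings(c.Removed)
	c.TotalDelta = mean(current) - mean(previous)
	return c
}

func main() {
	previous := CoverageSnapshot{"ansible": 80, "infra": 60, "legacy": 40}
	current := CoverageSnapshot{"ansible": 70, "infra": 65, "devkit": 90}
	c := compareCoverage(previous, current)
	if len(c.Regressions) > 0 {
		fmt.Println("coverage regressed in:", c.Regressions)
	}
	fmt.Println("new:", c.NewPackages, "removed:", c.Removed)
}
```

A CI step built on this would exit non-zero when Regressions is non-empty, then append the current snapshot to the store for the next run.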