Commit graph

10 commits

Author SHA1 Message Date
Snider
c9678fc04a Secure SSH and TLS connections, and fix CI issues
Addresses security concerns from OWASP audit and CodeQL by enforcing strict
host key verification and TLS certificate verification.

Security Changes:
- Enforced strict SSH host key checking in pkg/container and devops.
- Removed insecure SSH host key verification from pkg/ansible.
- Added synchronous host key discovery during VM boot using ssh-keyscan.
- Updated UniFi client to enforce TLS certificate verification by default.
- Added --insecure flag and config option for UniFi to allow opt-in to
  skipping TLS verification for self-signed certificates.

CI and Maintenance:
- Fixed auto-merge workflow by providing repository context to 'gh' command.
- Resolved merge conflicts in .github/workflows/auto-merge.yml.
- Added unit tests for secured Ansible SSH client.
- Fixed formatting issues identified by QA checks.
2026-02-05 03:48:42 +00:00
Snider
58bc2bceae
Merge branch 'dev' into fix/ssh-security-13442055821003769195 2026-02-05 03:46:47 +00:00
Snider
232bedf05f
Add Architecture Decision Records (ADRs) (#312)
* docs: add Architecture Decision Records (ADRs)

Established a system for documenting architectural decisions.
- Created docs/adr directory
- Added ADR template (0000-template.md)
- Established ADR process in docs/adr/README.md
- Documented 4 key existing decisions (0001-0004)
- Integrated ADRs into mkdocs.yml and docs/index.md

* docs: add Architecture Decision Records (ADRs)

Established a system for documenting architectural decisions.
- Created docs/adr directory
- Added ADR template (0000-template.md)
- Established ADR process in docs/adr/README.md
- Documented 4 key existing decisions (0001-0004)
- Integrated ADRs into mkdocs.yml and docs/index.md
- Fixed formatting in pkg/io/local/client.go

* docs: add ADRs and fix auto-merge CI

- Added Architecture Decision Records (ADRs) to docs/adr/
- Integrated ADRs into mkdocs.yml and docs/index.md
- Localized .github/workflows/auto-merge.yml to fix "fatal: not a git repository" error in the reusable workflow by adding explicit --repo context.
2026-02-05 03:43:16 +00:00
Snider
4ef698cbe3 Secure SSH, fix CI auto-merge, and resolve merge conflicts
This commit addresses the OWASP security audit by enforcing strict host key
verification and resolves persistent CI issues.

Security Changes:
- Replaced StrictHostKeyChecking=accept-new with yes in pkg/container and devops.
- Removed insecure host key verification from pkg/ansible.
- Implemented synchronous host key discovery using ssh-keyscan during VM boot.
- Updated Boot lifecycle to wait for host key verification.
- Handled missing known_hosts file in pkg/ansible.
- Refactored hardcoded SSH port to DefaultSSHPort constant.

CI and Maintenance:
- Fixed auto-merge.yml by inlining the script and adding repository context
  to 'gh' command, resolving the "not a git repository" error in CI.
- Resolved merge conflicts in .github/workflows/auto-merge.yml with dev branch.
- Added pkg/ansible/ssh_test.go for SSH client verification.
- Fixed formatting in pkg/io/local/client.go to pass QA checks.
2026-02-05 03:40:28 +00:00
Snider
d1616480eb Secure SSH commands and fix auto-merge CI failure
Addresses OWASP security audit by enforcing strict host key verification
and fixes a CI failure in the auto-merge workflow.

Key changes:
- Replaced StrictHostKeyChecking=accept-new with yes in pkg/container and pkg/devops.
- Removed insecure host key verification from pkg/ansible.
- Implemented synchronous host key discovery using ssh-keyscan during VM boot.
- Handled missing known_hosts file in pkg/ansible.
- Refactored hardcoded SSH port to DefaultSSHPort constant.
- Added pkg/ansible/ssh_test.go to verify SSH client initialization.
- Fixed formatting in pkg/io/local/client.go.
- Fixed auto-merge.yml by inlining the script and providing repository context
  to 'gh' command, resolving the "not a git repository" error in CI.
2026-02-05 03:26:50 +00:00
Snider
a135ba3c58
fix(ci): gofmt base branch + auto-merge permissions (#323)
* style(io): fix gofmt formatting in local client

Remove extra blank line that causes QA fmt check to fail on all PRs.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(auto-merge): add required permissions to workflow caller

The thin caller was missing contents:write and pull-requests:write
permissions, causing startup_failure on every auto-merge run.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 18:38:41 +00:00
Snider
21eda6b448
ci(workflows): replace inline pr-gate and auto-merge with org reusable callers (#303)
Moves the logic to host-uk/.github org-wide reusable workflows.
Fixes org-gate failure: uses author_association from webhook payload
instead of checkMembershipForUser (GITHUB_TOKEN lacks org scope).

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 17:51:00 +00:00
Snider
7be325302f
Migrate pkg/release to io.Medium abstraction (#290)
* chore(io): migrate pkg/release to io.Medium abstraction

Migrated `pkg/release` and its subpackages to use the `io.Medium` abstraction for filesystem operations. This enables better testability and support for alternative storage backends.

Changes:
- Added `FS io.Medium` field to `release.Release` and `publishers.Release` structs.
- Updated `LoadConfig`, `ConfigExists`, and `WriteConfig` in `pkg/release/config.go` to accept `io.Medium`.
- Updated `Publish`, `Run`, `findArtifacts`, and `buildArtifacts` in `pkg/release/release.go` to use `io.Medium`.
- Migrated all publishers (`aur`, `chocolatey`, `docker`, `github`, `homebrew`, `linuxkit`, `npm`, `scoop`) to use `io.Medium` for file operations.
- Implemented custom template overrides in publishers by checking for templates in `.core/templates/<publisher>/` via `io.Medium`.
- Updated all relevant tests to provide `io.Medium`.

* chore(io): fix missing callers in pkg/release migration

Updated callers of `release` package functions that had their signatures changed during the `io.Medium` migration.

Fixed files:
- `internal/cmd/ci/cmd_init.go`
- `internal/cmd/ci/cmd_publish.go`
- `pkg/build/buildcmd/cmd_release.go`

These changes ensure the project compiles successfully by providing `io.Local` to `LoadConfig`, `WriteConfig`, and `ConfigExists`.

* chore(io): fix build errors in pkg/release migration

Fixed compilation errors by updating all callers of `release.LoadConfig`, `release.ConfigExists`, and `release.WriteConfig` to provide the required `io.Medium` argument.

Files updated:
- `internal/cmd/ci/cmd_init.go`
- `internal/cmd/ci/cmd_publish.go`
- `pkg/build/buildcmd/cmd_release.go`

These entry points now correctly pass `io.Local` to the `release` package functions.
2026-02-04 15:07:13 +00:00
Snider
2e6a0f4ef3
chore(io): Migrate pkg/cli to Medium abstraction (#285)
* chore(io): Migrate pkg/cli to Medium abstraction

- Update `PIDFile` struct to include `io.Medium` field.
- Update `NewPIDFile` signature to accept `io.Medium`.
- Update `PIDFile` methods to use injected medium instead of `io.Local`.
- Add `Medium` field to `DaemonOptions`.
- Update `NewDaemon` to default to `io.Local` if no medium is provided.
- Update `pkg/cli/daemon_test.go` to reflect changes and add mock medium tests.

* chore(io): Migrate pkg/cli to Medium abstraction

- Update `PIDFile` struct to include `io.Medium` field.
- Update `NewPIDFile` signature to accept `io.Medium`.
- Update `PIDFile` methods to use injected medium instead of `io.Local`.
- Add `Medium` field to `DaemonOptions`.
- Update `NewDaemon` to default to `io.Local` if no medium is provided.
- Update `pkg/cli/daemon_test.go` to reflect changes and add mock medium tests.
- Fix flaky test `TestLinuxKitManager_Stop_Good_ContextCancelled` by checking context at the start of `Stop`.
- Add fail-fast context checks to all `LinuxKitManager` methods taking a context.
2026-02-04 14:33:33 +00:00
Snider
ce7d7e5866
feat(ci): auto-merge pipeline, org gate, and QA fix hints (#284)
* refactor(core): decompose Core into serviceManager + messageBus (#215)

Extract two focused, unexported components from the Core "god object":

- serviceManager: owns service registry, lifecycle tracking (startables/
  stoppables), and service lock
- messageBus: owns IPC action dispatch, query handling, and task handling

All public API methods on Core become one-line delegation wrappers.
Zero consumer changes — no files outside pkg/framework/core/ modified.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(core): remove unused fields from test struct

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(core): address review feedback from Gemini and Copilot

- Move locked check inside mutex in registerService to fix TOCTOU race
- Add mutex guards to enableLock and applyLock methods
- Replace fmt.Errorf with errors.Join in action() for correct error
  aggregation (consistent with queryAll and lifecycle methods)
- Add TestMessageBus_Action_Bad for error aggregation coverage

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* ci(workflows): bump host-uk/build from v3 to v4

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* ci(workflows): replace Wails build with Go CLI build

The build action doesn't yet support Wails v3. Comment out the GUI
build step and use host-uk/build/actions/setup/go for Go toolchain
setup with a plain `go build` for the CLI binary.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(container): check context before select in Stop to fix flaky test

Stop() now checks ctx.Err() before entering the select block. When a
pre-cancelled context is passed, the select could non-deterministically
choose <-done over <-ctx.Done() if the process had already exited,
causing TestLinuxKitManager_Stop_Good_ContextCancelled to fail on CI.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(ci): trim CodeQL matrix to valid languages

Remove javascript-typescript and actions from CodeQL matrix — this
repo contains only Go and Python. Invalid languages blocked SARIF
upload and prevented merge.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(go): add `core go fuzz` command and wire into QA

- New `core go fuzz` command discovers Fuzz* targets and runs them
  with configurable --duration (default 10s per target)
- Fuzz added to default QA checks with 5s burst duration
- Seed fuzz targets for core package: FuzzE (error constructor),
  FuzzServiceRegistration, FuzzMessageDispatch

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* ci(codeql): add workflow_dispatch trigger for manual runs

Allows manual triggering of CodeQL when the automatic pull_request
trigger doesn't fire.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* ci(codeql): remove workflow in favour of default setup

CodeQL default setup is now enabled via repo settings for go and
python. The workflow-based approach uploaded results as "code quality"
rather than "code scanning", which didn't satisfy the code_scanning
ruleset requirement. Default setup handles this natively.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* ci(workflows): add explicit permissions to all workflows

- agent-verify: add issues: write (was missing, writes comments/labels)
- ci: add contents: read (explicit least-privilege)
- coverage: add contents: read (explicit least-privilege)

All workflows now declare permissions explicitly. Repo default is
read-only, so workflows without a block silently lacked write access.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* ci(workflows): replace inline logic with org reusable workflow callers

agent-verify.yml and auto-project.yml now delegate to centralised
reusable workflows in host-uk/.github, reducing per-repo duplication.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(ci): auto-merge pipeline, org gate, and QA fix hints

Add auto-merge workflow for org member PRs, external PR gate with
label-based approval, and actionable fix instructions for QA failures.

- auto-merge.yml: enable squash auto-merge for org member PRs
- pr-gate.yml: org-gate check blocks external PRs without label
- cmd_qa.go: add FixHint field, fixHintFor(), extractFailingTest()
- Ruleset: thread resolution, stale review dismissal, 1min merge wait

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 14:13:32 +00:00