Commit graph

7 commits

Author SHA1 Message Date
Snider
9f4007c409 Remove StrictHostKeyChecking=no from SSH commands (#315)
* Remove StrictHostKeyChecking=no and implement proper host key verification

This commit addresses security concerns from the OWASP audit by enforcing
strict host key verification for all SSH and SCP commands.

Key changes:
- Replaced StrictHostKeyChecking=accept-new with yes in pkg/container and pkg/devops.
- Removed insecure host key verification from pkg/ansible SSH client.
- Implemented a synchronous host key discovery mechanism during VM boot
  using ssh-keyscan to populate ~/.core/known_hosts.
- Updated the devops Boot lifecycle to wait until the host key is verified.
- Ensured pkg/ansible correctly handles missing known_hosts files.
- Refactored hardcoded SSH port 2222 to a package constant DefaultSSHPort.
- Added CORE_SKIP_SSH_SCAN environment variable for test environments.

* Remove StrictHostKeyChecking=no and implement proper host key verification

Addresses security concerns from OWASP audit by enforcing strict host key
verification.

Changes:
- Replaced StrictHostKeyChecking=accept-new with yes in pkg/container and devops.
- Removed insecure host key verification from pkg/ansible.
- Added synchronous host key discovery using ssh-keyscan during VM boot.
- Updated Boot lifecycle to wait for host key verification.
- Handled missing known_hosts file in pkg/ansible.
- Refactored hardcoded SSH port to DefaultSSHPort constant.
- Fixed formatting issues identified by QA check.

* Secure SSH commands and fix auto-merge CI failure

Addresses OWASP security audit by enforcing strict host key verification
and fixes a CI failure in the auto-merge workflow.

Key changes:
- Replaced StrictHostKeyChecking=accept-new with yes in pkg/container and pkg/devops.
- Removed insecure host key verification from pkg/ansible.
- Implemented synchronous host key discovery using ssh-keyscan during VM boot.
- Handled missing known_hosts file in pkg/ansible.
- Refactored hardcoded SSH port to DefaultSSHPort constant.
- Added pkg/ansible/ssh_test.go to verify SSH client initialization.
- Fixed formatting in pkg/io/local/client.go.
- Fixed auto-merge.yml by inlining the script and providing repository context
  to 'gh' command, resolving the "not a git repository" error in CI.

* Secure SSH, fix CI auto-merge, and resolve merge conflicts

This commit addresses the OWASP security audit by enforcing strict host key
verification and resolves persistent CI issues.

Security Changes:
- Replaced StrictHostKeyChecking=accept-new with yes in pkg/container and devops.
- Removed insecure host key verification from pkg/ansible.
- Implemented synchronous host key discovery using ssh-keyscan during VM boot.
- Updated Boot lifecycle to wait for host key verification.
- Handled missing known_hosts file in pkg/ansible.
- Refactored hardcoded SSH port to DefaultSSHPort constant.

CI and Maintenance:
- Fixed auto-merge.yml by inlining the script and adding repository context
  to 'gh' command, resolving the "not a git repository" error in CI.
- Resolved merge conflicts in .github/workflows/auto-merge.yml with dev branch.
- Added pkg/ansible/ssh_test.go for SSH client verification.
- Fixed formatting in pkg/io/local/client.go to pass QA checks.

* Secure SSH and TLS connections, and fix CI issues

Addresses security concerns from OWASP audit and CodeQL by enforcing strict
host key verification and TLS certificate verification.

Security Changes:
- Enforced strict SSH host key checking in pkg/container and devops.
- Removed insecure SSH host key verification from pkg/ansible.
- Added synchronous host key discovery during VM boot using ssh-keyscan.
- Updated UniFi client to enforce TLS certificate verification by default.
- Added --insecure flag and config option for UniFi to allow opt-in to
  skipping TLS verification for self-signed certificates.

CI and Maintenance:
- Fixed auto-merge workflow by providing repository context to 'gh' command.
- Resolved merge conflicts in .github/workflows/auto-merge.yml.
- Added unit tests for secured Ansible SSH client.
- Fixed formatting issues identified by QA checks.

* fix: gofmt alignment in cmd_config.go

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Secure connections, fix CI auto-merge, and resolve formatting

Addresses OWASP security audit and CodeQL security alerts by enforcing
secure defaults for SSH and TLS connections.

Key changes:
- Enforced strict SSH host key checking (StrictHostKeyChecking=yes).
- Implemented synchronous host key verification during VM boot using ssh-keyscan.
- Updated UniFi client to enforce TLS certificate verification by default.
- Added --insecure flag and config option for UniFi to allow opt-in to
  skipping TLS verification.
- Fixed auto-merge workflow by providing repository context to 'gh' command.
- Resolved merge conflicts in .github/workflows/auto-merge.yml.
- Fixed formatting in internal/cmd/unifi/cmd_config.go and pkg/io/local/client.go.
- Added unit tests for secured Ansible SSH client.

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Claude <developers@lethean.io>
2026-02-05 07:09:34 +00:00
Snider
1d73209e89 Add Architecture Decision Records (ADRs) (#312)
* docs: add Architecture Decision Records (ADRs)

Established a system for documenting architectural decisions.
- Created docs/adr directory
- Added ADR template (0000-template.md)
- Established ADR process in docs/adr/README.md
- Documented 4 key existing decisions (0001-0004)
- Integrated ADRs into mkdocs.yml and docs/index.md

* docs: add Architecture Decision Records (ADRs)

Established a system for documenting architectural decisions.
- Created docs/adr directory
- Added ADR template (0000-template.md)
- Established ADR process in docs/adr/README.md
- Documented 4 key existing decisions (0001-0004)
- Integrated ADRs into mkdocs.yml and docs/index.md
- Fixed formatting in pkg/io/local/client.go

* docs: add ADRs and fix auto-merge CI

- Added Architecture Decision Records (ADRs) to docs/adr/
- Integrated ADRs into mkdocs.yml and docs/index.md
- Localized .github/workflows/auto-merge.yml to fix "fatal: not a git repository" error in the reusable workflow by adding explicit --repo context.
2026-02-05 03:43:16 +00:00
Snider
1dffdd3a1b fix(ci): gofmt base branch + auto-merge permissions (#323)
* style(io): fix gofmt formatting in local client

Remove extra blank line that causes QA fmt check to fail on all PRs.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(auto-merge): add required permissions to workflow caller

The thin caller was missing contents:write and pull-requests:write
permissions, causing startup_failure on every auto-merge run.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 18:38:41 +00:00
Snider
8f369000ad ci(workflows): replace inline pr-gate and auto-merge with org reusable callers (#303)
Moves the logic to host-uk/.github org-wide reusable workflows.
Fixes org-gate failure: uses author_association from webhook payload
instead of checkMembershipForUser (GITHUB_TOKEN lacks org scope).

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 17:51:00 +00:00
Snider
acec997d18 Migrate pkg/release to io.Medium abstraction (#290)
* chore(io): migrate pkg/release to io.Medium abstraction

Migrated `pkg/release` and its subpackages to use the `io.Medium` abstraction for filesystem operations. This enables better testability and support for alternative storage backends.

Changes:
- Added `FS io.Medium` field to `release.Release` and `publishers.Release` structs.
- Updated `LoadConfig`, `ConfigExists`, and `WriteConfig` in `pkg/release/config.go` to accept `io.Medium`.
- Updated `Publish`, `Run`, `findArtifacts`, and `buildArtifacts` in `pkg/release/release.go` to use `io.Medium`.
- Migrated all publishers (`aur`, `chocolatey`, `docker`, `github`, `homebrew`, `linuxkit`, `npm`, `scoop`) to use `io.Medium` for file operations.
- Implemented custom template overrides in publishers by checking for templates in `.core/templates/<publisher>/` via `io.Medium`.
- Updated all relevant tests to provide `io.Medium`.

* chore(io): fix missing callers in pkg/release migration

Updated callers of `release` package functions that had their signatures changed during the `io.Medium` migration.

Fixed files:
- `internal/cmd/ci/cmd_init.go`
- `internal/cmd/ci/cmd_publish.go`
- `pkg/build/buildcmd/cmd_release.go`

These changes ensure the project compiles successfully by providing `io.Local` to `LoadConfig`, `WriteConfig`, and `ConfigExists`.

* chore(io): fix build errors in pkg/release migration

Fixed compilation errors by updating all callers of `release.LoadConfig`, `release.ConfigExists`, and `release.WriteConfig` to provide the required `io.Medium` argument.

Files updated:
- `internal/cmd/ci/cmd_init.go`
- `internal/cmd/ci/cmd_publish.go`
- `pkg/build/buildcmd/cmd_release.go`

These entry points now correctly pass `io.Local` to the `release` package functions.
2026-02-04 15:07:13 +00:00
Snider
c54b28249c chore(io): Migrate pkg/cli to Medium abstraction (#285)
* chore(io): Migrate pkg/cli to Medium abstraction

- Update `PIDFile` struct to include `io.Medium` field.
- Update `NewPIDFile` signature to accept `io.Medium`.
- Update `PIDFile` methods to use injected medium instead of `io.Local`.
- Add `Medium` field to `DaemonOptions`.
- Update `NewDaemon` to default to `io.Local` if no medium is provided.
- Update `pkg/cli/daemon_test.go` to reflect changes and add mock medium tests.

* chore(io): Migrate pkg/cli to Medium abstraction

- Update `PIDFile` struct to include `io.Medium` field.
- Update `NewPIDFile` signature to accept `io.Medium`.
- Update `PIDFile` methods to use injected medium instead of `io.Local`.
- Add `Medium` field to `DaemonOptions`.
- Update `NewDaemon` to default to `io.Local` if no medium is provided.
- Update `pkg/cli/daemon_test.go` to reflect changes and add mock medium tests.
- Fix flaky test `TestLinuxKitManager_Stop_Good_ContextCancelled` by checking context at the start of `Stop`.
- Add fail-fast context checks to all `LinuxKitManager` methods taking a context.
2026-02-04 14:33:33 +00:00
Snider
90531c148d feat(ci): auto-merge pipeline, org gate, and QA fix hints (#284)
* refactor(core): decompose Core into serviceManager + messageBus (#215)

Extract two focused, unexported components from the Core "god object":

- serviceManager: owns service registry, lifecycle tracking (startables/
  stoppables), and service lock
- messageBus: owns IPC action dispatch, query handling, and task handling

All public API methods on Core become one-line delegation wrappers.
Zero consumer changes — no files outside pkg/framework/core/ modified.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(core): remove unused fields from test struct

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(core): address review feedback from Gemini and Copilot

- Move locked check inside mutex in registerService to fix TOCTOU race
- Add mutex guards to enableLock and applyLock methods
- Replace fmt.Errorf with errors.Join in action() for correct error
  aggregation (consistent with queryAll and lifecycle methods)
- Add TestMessageBus_Action_Bad for error aggregation coverage

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* ci(workflows): bump host-uk/build from v3 to v4

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* ci(workflows): replace Wails build with Go CLI build

The build action doesn't yet support Wails v3. Comment out the GUI
build step and use host-uk/build/actions/setup/go for Go toolchain
setup with a plain `go build` for the CLI binary.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(container): check context before select in Stop to fix flaky test

Stop() now checks ctx.Err() before entering the select block. When a
pre-cancelled context is passed, the select could non-deterministically
choose <-done over <-ctx.Done() if the process had already exited,
causing TestLinuxKitManager_Stop_Good_ContextCancelled to fail on CI.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(ci): trim CodeQL matrix to valid languages

Remove javascript-typescript and actions from CodeQL matrix — this
repo contains only Go and Python. Invalid languages blocked SARIF
upload and prevented merge.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(go): add `core go fuzz` command and wire into QA

- New `core go fuzz` command discovers Fuzz* targets and runs them
  with configurable --duration (default 10s per target)
- Fuzz added to default QA checks with 5s burst duration
- Seed fuzz targets for core package: FuzzE (error constructor),
  FuzzServiceRegistration, FuzzMessageDispatch

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* ci(codeql): add workflow_dispatch trigger for manual runs

Allows manual triggering of CodeQL when the automatic pull_request
trigger doesn't fire.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* ci(codeql): remove workflow in favour of default setup

CodeQL default setup is now enabled via repo settings for go and
python. The workflow-based approach uploaded results as "code quality"
rather than "code scanning", which didn't satisfy the code_scanning
ruleset requirement. Default setup handles this natively.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* ci(workflows): add explicit permissions to all workflows

- agent-verify: add issues: write (was missing, writes comments/labels)
- ci: add contents: read (explicit least-privilege)
- coverage: add contents: read (explicit least-privilege)

All workflows now declare permissions explicitly. Repo default is
read-only, so workflows without a block silently lacked write access.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* ci(workflows): replace inline logic with org reusable workflow callers

agent-verify.yml and auto-project.yml now delegate to centralised
reusable workflows in host-uk/.github, reducing per-repo duplication.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(ci): auto-merge pipeline, org gate, and QA fix hints

Add auto-merge workflow for org member PRs, external PR gate with
label-based approval, and actionable fix instructions for QA failures.

- auto-merge.yml: enable squash auto-merge for org member PRs
- pr-gate.yml: org-gate check blocks external PRs without label
- cmd_qa.go: add FixHint field, fixHintFor(), extractFailingTest()
- Ruleset: thread resolution, stale review dismissal, 1min merge wait

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 14:13:32 +00:00