22 KiB
22 KiB
Security Attack Vector Mapping Report
Execution plan in docs/plans/2026-03-24-security-attack-vector-mapping.md, results in docs/security/scan-attack-vector-mapping-report.md.
- Issue:
[scan] Security attack vector mapping - Repo:
core/go-scm - Branch:
agent/create-an-implementation-plan-for-this-s - Date:
2026-03-24 - Agent:
Codex
Findings Matrix
Status meanings:
open: validation gap or exploitable flow remains.validated: explicit hardening exists and was verified in code.mapped: flow is real, but the current risk is limited or mostly delegated to downstream tools.
file:line | input source | flows into | validation | attack vector | severity | confidence | verification
agentci/config.go:34 | config file `agentci.agents.*` (`host`, `queue_dir`, `forgejo_user`, models, runner) | `LoadAgents()` -> `Spinner.FindByForgejoUser()` -> `jobrunner/handlers.DispatchHandler` SSH target selection | only presence/default checks; queue dir is validated later, host is never allowlisted | SSH target manipulation / unauthorised agent routing if config is attacker-controlled | medium | med | mapped
cmd/collect/cmd.go:56 | CLI flag `--output` | `newConfig()` -> `collect.NewConfigWithMedium()` -> `cfg.OutputDir` and `.collect-state.json` | no canonicalisation or root policy | arbitrary local file write/read root for all collectors and processors | low | high | open
collect/bitcointalk.go:64 | positional `<topic-id>` via `cmd/collect/cmd_bitcointalk.go` | `BitcoinTalkCollector.Collect()` -> `filepath.Join(cfg.OutputDir, "bitcointalk", topicID, "posts")` -> `EnsureDir`/`Write` | only non-empty topic ID required | path traversal / arbitrary file write | high | high | open
collect/bitcointalk.go:89 | positional `<topic-id>` via `cmd/collect/cmd_bitcointalk.go` | `fmt.Sprintf("https://bitcointalk.org/index.php?topic=%s.%d", topicID, offset)` -> `http.NewRequestWithContext` | no query escaping or numeric check | malformed request / same-host request steering | medium | med | open
collect/github.go:123 | CLI `org/repo`, `--org`, or `excavate <project>` fallback org | `exec.CommandContext("gh", "repo", "list", g.Org, ...)` | no org format validation; `exec.CommandContext` blocks shell metachar expansion | scope expansion / excessive API enumeration; low injection surface | low | med | mapped
collect/github.go:179 | CLI `org/repo` and GitHub repo names from `gh repo list` | `filepath.Join(cfg.OutputDir, "github", g.Org, repo, "issues")` -> `EnsureDir`/`Write` | no path-segment validation on `g.Org` or `repo` | path traversal / arbitrary file write | high | high | open
collect/github.go:238 | CLI `org/repo` and GitHub repo names from `gh repo list` | `filepath.Join(cfg.OutputDir, "github", g.Org, repo, "pulls")` -> `EnsureDir`/`Write` | no path-segment validation on `g.Org` or `repo` | path traversal / arbitrary file write | high | high | open
collect/github.go:284 | GitHub API `issue.body` / `pr.body` | `formatIssueMarkdown()` -> markdown files under `cfg.Output` | no sanitisation before persistence | stored markdown/HTML/prompt injection in downstream renderers or LLM ingestion | medium | high | open
collect/market.go:84 | CLI `<coin>` via `cmd/collect/cmd_market.go` | `filepath.Join(cfg.OutputDir, "market", m.CoinID)` -> `EnsureDir` | only non-empty `CoinID` check | path traversal / arbitrary file write | high | high | open
collect/market.go:132 | CLI `<coin>` via `cmd/collect/cmd_market.go` | `fmt.Sprintf("%s/coins/%s", coinGeckoBaseURL, m.CoinID)` -> `fetchJSON()` | no path escaping or allowlist for `CoinID` | same-host endpoint injection / SSRF-style path steering | medium | high | open
collect/market.go:184 | CLI `<coin>` and `--from` | `fmt.Sprintf("%s/coins/%s/market_chart?vs_currency=usd&days=%s", ...)` -> `fetchJSON()` | `FromDate` parsing bounds `days`; `CoinID` remains unescaped | same-host endpoint injection / excessive upstream query shaping | medium | high | open
collect/papers.go:113 | CLI `--query` | `url.QueryEscape(p.Query)` -> IACR search request | query is escaped before request construction | query injection blocked; residual risk is upstream content trust only | low | high | validated
collect/papers.go:145 | IACR HTML `href` values parsed from remote search results | `parseIACREntry()` -> derived `ppr.ID` -> `filepath.Join(baseDir, ppr.ID+".md")` | no filename sanitisation on remote-derived ID | path traversal / arbitrary file write if upstream HTML is hostile or compromised | high | med | open
collect/papers.go:199 | CLI `--query` and `--category` | `url.QueryEscape(...)` -> arXiv API request, then `arxivEntryToPaper()` replaces `/` and `:` in IDs | query and category are escaped; arXiv IDs are partially canonicalised | request injection largely blocked; residual risk is remote content trust | low | high | validated
collect/process.go:50 | CLI `<dir>` via `cmd/collect/cmd_process.go` | `cfg.Output.List(p.Dir)` and `cfg.Output.Read(filepath.Join(p.Dir, name))` | no root containment or canonicalisation of `p.Dir` | arbitrary local read/list when using `io.Local` medium | high | high | open
collect/process.go:55 | CLI `<source>` via `cmd/collect/cmd_process.go` | `filepath.Join(cfg.OutputDir, "processed", p.Source)` -> `EnsureDir`/`Write` | no path validation on `p.Source` | path traversal / arbitrary file write | high | high | open
collect/process.go:214 | source HTML `href` attributes from files under `p.Dir` | `nodeToMarkdown()` -> `[text](href)` persisted to markdown | no scheme allowlist or URL sanitisation | stored `javascript:` / `data:` link injection | medium | high | open
collect/ratelimit.go:44 | concurrent caller-controlled source keys | `RateLimiter.Wait()` -> shared `last[source]` map | mutex released before slot reservation; no second eligibility check after sleep | burst bypass / API quota exhaustion / availability degradation | medium | high | open
forge/config.go:40 | config file `forge.url`, `forge.token`, env `FORGE_URL`, `FORGE_TOKEN`, CLI overrides | `ResolveConfig()` -> `forge.New()` -> Forgejo SDK client | token required, URL not allowlisted or canonicalised | SSRF / credential exfiltration to attacker-controlled Forgejo endpoint | high | high | open
cmd/forge/cmd_migrate.go:46 | CLI `<clone-url>`, `--service`, `--token`, `--org` | `runMigrate()` -> `forge.Client.MigrateRepo()` with `CloneAddr` and `AuthToken` | repo name is derived from URL tail only; clone URL itself is not constrained | server-side clone SSRF / internal repo import / credential misuse | high | high | open
cmd/forge/cmd_sync.go:95 | CLI repo args and `--base-path` | `syncRepoNameFromArg()` + `ResolvePathWithinRoot()` -> `git -C <localPath>` / `git push` | repo names validated via `ValidatePathElement`; local path confined to base path | path traversal mitigated; residual risk limited to untrusted Forge URL config | low | high | validated
gitea/config.go:40 | config file `gitea.url`, `gitea.token`, env `GITEA_URL`, `GITEA_TOKEN`, CLI overrides | `ResolveConfig()` -> `gitea.New()` -> Gitea SDK client | token required, URL not allowlisted or canonicalised | SSRF / credential exfiltration to attacker-controlled Gitea endpoint | high | high | open
cmd/gitea/cmd_mirror.go:45 | CLI `<github-owner/repo>` and optional GitHub token | `runMirror()` -> fixed `https://github.com/<owner>/<repo>.git` -> `gitea.CreateMirror()` | owner/repo are split, but not allowlisted; source token may be forwarded | server-side clone of attacker-chosen repo and token relay to remote Gitea | high | med | open
cmd/gitea/cmd_sync.go:95 | CLI repo args and `--base-path` | `repoNameFromArg()` + `ResolvePathWithinRoot()` -> `git -C <localPath>` / `git push` | repo names validated via `ValidatePathElement`; local path confined to base path | path traversal mitigated; residual risk limited to untrusted Gitea URL config | low | high | validated
jobrunner/forgejo/signals.go:16 | epic issue body checklist text | `parseEpicChildren()` -> child issue numbers -> `pollRepo()` dispatch decisions | regex only checks `#<digits>`; no repository or ownership cross-check | workflow spoofing / unintended child issue dispatch | medium | high | open
jobrunner/forgejo/signals.go:36 | PR body text | `findLinkedPR()` -> child issue association -> downstream handler execution | first `#<n>` match wins; no keyword, owner, or repo validation | PR-body spoofing causes wrong issue state transitions | medium | high | open
jobrunner/forgejo/source.go:72 | handler `result.Error` strings | `Report()` -> Forgejo issue comment code block | no sanitisation, truncation, or mention filtering | comment injection / mention spam / log forging of pipeline output | medium | high | open
jobrunner/handlers/dispatch.go:89 | agent config `queue_dir` plus `signal.RepoOwner` / `signal.RepoName` | `Execute()` -> remote queue path building and Forgejo label/comment operations | remote dir is validated; repo owner/name still pass through `SanitizePath(filepath.Base(...))` | retargeting/confusion if a forged signal supplies slash-containing names | medium | med | open
jobrunner/handlers/dispatch.go:151 | issue title/body, assignee, repo metadata from `PipelineSignal` | `DispatchTicket` JSON -> remote agent queue file | JSON encoding only; content is not filtered, signed, or policy-checked | prompt injection / remote agent control-surface abuse | high | high | open
jobrunner/handlers/dispatch.go:212 | local Forge token | `.env.<ticketID>` content -> remote worker filesystem | remote path validated and file mode set to `0600` after transfer | credential exposure on remote worker host, snapshots, or backups | medium | high | mapped
jobrunner/handlers/dispatch.go:255 | runtime `reason` strings and raw `signal.RepoOwner` / `signal.RepoName` | `failDispatch()` -> labels and issue comments on Forgejo | no sanitisation; fail path uses raw signal repo fields | comment injection and repo-retargeting if upstream signal is forged | medium | med | open
jobrunner/handlers/completion.go:58 | agent completion `signal.Message` / `signal.Error` | `CreateIssueComment()` on child issue | no sanitisation, truncation, or mention filtering | stored comment injection / mention spam / log forging | medium | high | open
jobrunner/journal.go:130 | `signal.RepoOwner` / `signal.RepoName` | `Append()` -> `<base>/<owner>/<repo>/<date>.jsonl` | strict component regex, absolute-prefix containment check, and mutexed writes | path traversal blocked; safe write pattern verified | low | high | validated
git/service.go:72 | concurrent query/task caller inputs `Paths` and service calls | `handleQuery()` / `handleTask()` -> shared `s.lastStatus` cache | no mutex or atomic protection around `lastStatus` | data race / stale or torn state under concurrent queries | medium | high | open
cmd/scm/cmd_compile.go:54 | CLI `--sign-key` hex string | `hex.DecodeString()` -> `manifest.Compile()` -> `manifest.Sign()` -> `ed25519.Sign()` | hex decoding only; no private key length check | panic-triggered DoS via invalid Ed25519 private key length | high | high | open
cmd/scm/cmd_export.go:28 | CLI `--dir` | `io.NewSandboxed(dir)` -> `manifest.LoadCompiled()` / `manifest.Load()` -> stdout | sandboxed medium confines access to the chosen root | local file disclosure limited to caller-selected directory | low | high | mapped
cmd/scm/cmd_index.go:39 | CLI `--dir`, `--output`, `--base-url`, `--org` | `marketplace.Builder.BuildFromDirs()` / `marketplace.WriteIndex()` | no path or URL validation beyond OS errors | arbitrary local read/write and untrusted repo URL generation in produced index | medium | high | open
pkg/api/provider.go:202 | HTTP query params `q` and `category` | `listMarketplace()` -> in-memory `Index.Search()` / `ByCategory()` | no server-side escaping needed; no filesystem or outbound request sink | low-risk search-scope manipulation only | low | high | mapped
pkg/api/provider.go:466 | HTTP path param `:code` | `marketplaceCodeParam()` -> install/update/remove lookups | `PathUnescape()` + `ValidatePathElement()` | path traversal blocked for marketplace code paths | low | high | validated
pkg/api/provider.go:309 | HTTP JSON body `public_key` | `hex.DecodeString()` -> `manifest.Verify()` -> `ed25519.Verify()` | hex decoding only; no public key length check | panic-triggered DoS via malformed Ed25519 public key | high | high | open
pkg/api/provider.go:346 | HTTP JSON body `private_key` | `hex.DecodeString()` -> `manifest.Sign()` -> `ed25519.Sign()` | hex decoding only; no private key length check | panic-triggered DoS via malformed Ed25519 private key | high | high | open
marketplace/installer.go:50 | marketplace index `Module.Code` and `Module.Repo` | `resolveModulePath()` -> `git clone <repo> <dest>` -> manifest load/store | module code is now confined with `ResolvePathWithinRoot`; repo URL is unvalidated | arbitrary git transport / SSRF / local-path clone from untrusted marketplace index | high | high | open
marketplace/installer.go:195 | marketplace index `Module.SignKey` | `hex.DecodeString()` -> `manifest.LoadVerified()` -> `ed25519.Verify()` | hex decoding only; no public key length check | panic-triggered DoS during install/update from malformed marketplace metadata | high | high | open
plugin/installer.go:181 | plugin source `org/repo@version` | `ParseSource()` -> registry name/path selection -> clone arguments | org and repo validated as path elements; version only checked for non-empty string | branch/ref abuse and unexpected checkout target selection | medium | med | mapped
plugin/installer.go:165 | plugin `@version` suffix | `gh repo clone <org/repo> <dest> -- --branch <version>` | `exec.CommandContext` blocks shell injection; no ref allowlist | malicious ref names can force unexpected refs/tags or failure loops | medium | med | open
plugin/loader.go:51 | caller-supplied plugin name | `filepath.Join(baseDir, name, "plugin.json")` -> `LoadManifest()` | no validation on `name` | path traversal / arbitrary local file read | high | high | open
repos/registry.go:66 | caller-supplied registry path | `io.Medium.Read(path)` -> YAML parse | no path validation | arbitrary local file read via caller-chosen registry path | medium | high | open
repos/registry.go:81 | YAML `base_path` and per-repo `path` values | expanded `Repo.Path` -> later `Exists()`, `IsGitRepo()`, git config probing | tilde expansion only; no root confinement | workspace escape / probing of arbitrary directories from untrusted registry | medium | high | open
repos/kbconfig.go:69 | caller-supplied workspace root | `.core/kb.yaml` read/write | no canonicalisation beyond join | arbitrary local read/write under caller-selected root | low | high | mapped
repos/kbconfig.go:110 | KB config `Wiki.Remote`, `Wiki.Dir`, and caller-supplied `repoName` | `WikiRepoURL()` and `WikiLocalPath()` | no validation of remote URL, local subdir, or repo name | path traversal in local wiki path and clone redirection to attacker-chosen remote | medium | high | open
Deduplicated Attack Vectors
highPath traversal and arbitrary local read/write from unsanitised path joins incollect/,plugin/loader.go,repos/registry.go, andrepos/kbconfig.go.highSSRF, server-side clone, and credential exfiltration via unvalidated Forgejo/Gitea base URLs, migration clone URLs, and marketplace repo URLs.highPanic-triggered availability loss from unchecked Ed25519 key lengths in CLI and HTTP signing/verification flows.highPrompt injection and remote worker influence via raw issue title/body content written into dispatch tickets.mediumStored markdown/comment injection via GitHub bodies, processed HTML links, jobrunner error comments, and completion messages.mediumWorkflow spoofing from weak issue/PR text parsing injobrunner/forgejo/signals.go.mediumBranch/ref steering and unexpected checkout behaviour in plugin installation.mediumConcurrency races incollect/ratelimit.goandgit/service.go.lowBroad CLI-selected filesystem scope incmd/collect,cmd/scm export, andrepos/kbconfig.goremains expected but should not be delegated to lower-trust callers.lowVerified mitigations exist for journal path safety, marketplace code path validation, and sync local-path confinement.
Verification Status
- Total mapped flows:
50 open:37validated:6mapped:7- High severity:
18 - Medium severity:
20 - Low severity:
12
Subsystem Summary
P1 collect/: the highest concentration of exploitable issues is in collector output path construction and persistence of raw third-party content.collect/github.go,collect/market.go,collect/bitcointalk.go,collect/papers.go, andcollect/process.goall need path and content hardening.P1 forge/gitea/marketplace/pkg/api: base URL trust, migration/mirror clone sources, and Ed25519 key handling are the main confidentiality and availability risks. These are central control-plane paths and should be fixed before widening API exposure.P2 jobrunner/: epic/PR text parsing, remote dispatch ticket generation, and comment reporting create workflow spoofing and prompt-injection surfaces. The journal implementation is the notable exception and is already hardened.P2 workspace/plugin/repos/git: plugin loading, registry path handling, wiki path generation, and shared git-service state carry medium-risk filesystem and race-condition exposure.P3 validated controls:jobrunner/journal.go,cmd/forge/cmd_sync.go,cmd/gitea/cmd_sync.go, andpkg/api/provider.gomarketplace code normalisation are solid reference implementations for future hardening work.
High-Priority Follow-Up
- Replace every collector and loader
filepath.Join(..., userValue, ...)call with strict path-element validation before directory creation or file reads/writes. - Add scheme, host, and canonicalisation allowlists for Forgejo/Gitea base URLs, migration clone URLs, marketplace repo URLs, and wiki remotes.
- Reject invalid Ed25519 key lengths before calling
ed25519.Signored25519.Verifyin CLI, HTTP, and installer code paths. - Treat dispatch tickets as untrusted agent prompts: strip or isolate raw issue bodies, add provenance metadata, and avoid shipping long-lived Forge tokens to remote workers where possible.
- Harden
jobrunner/forgejolinking logic so epic children and PR references must match explicit repo/issue relationships, not bare#123text. - Sanitize or encode user-visible comments and stored markdown links before writing them to Forgejo or output files.
- Fix
collect.RateLimiter.Waitandgit.Service.lastStatuswith proper reservation and synchronisation.
Scan Coverage
The full scan set from the issue plan was reviewed in package order. Files below either map directly to the matrix above or are forward-only wrappers/helpers with no additional unique external entry point beyond the flows already captured.
agentci:clotho.go(strategy helper only),config.go(matrix),security.go(validation utilities only).cmd/collect:cmd.go(matrix),cmd_bitcointalk.go,cmd_excavate.go,cmd_github.go,cmd_market.go,cmd_papers.go,cmd_process.go(all covered bycollect/*rows),cmd_dispatch.go(local event enum validation only; no command execution sink).cmd/forge:cmd_auth.goandcmd_config.gofeed theforge/config.gorow;cmd_migrate.goandcmd_sync.goare in the matrix;cmd_forge.go,cmd_issues.go,cmd_labels.go,cmd_orgs.go,cmd_prs.go,cmd_repos.go,cmd_status.go,helpers.goare CLI/API wrappers only.cmd/gitea:cmd_config.gofeeds thegitea/config.gorow;cmd_mirror.goandcmd_sync.goare in the matrix;cmd_gitea.go,cmd_issues.go,cmd_prs.go,cmd_repos.goare CLI/API wrappers only.cmd/scm:cmd_compile.go,cmd_export.go,cmd_index.goare in the matrix;cmd_scm.gois a root command wrapper only.collect:bitcointalk.go,collect.go,github.go,market.go,papers.go,process.go,ratelimit.goare in the matrix;events.goandexcavate.goadd no new sink beyond already-mapped config and collector flows;state.gois subsumed by the shared--outputpath row.forge:client.go,config.go,repos.goare in the matrix;issues.go,labels.go,meta.go,orgs.go,prs.go,webhooks.goforward typed inputs into the SDK without additional local command/path construction.git:service.gois in the matrix;git.gois command-wrapper logic without additional external entry surface beyond caller-supplied repo paths.gitea:client.go,config.go,repos.goare in the matrix;issues.goandmeta.goare SDK wrappers only.jobrunner/forgejo:signals.goandsource.goare in the matrix.jobrunner/handlers:completion.goanddispatch.goare in the matrix;enable_auto_merge.go,publish_draft.go,resolve_threads.go,send_fix_command.go, andtick_parent.goconsume already-mappedPipelineSignalfields and emit fixed API calls/messages only.jobrunner:journal.gois in the matrix;poller.goandtypes.godefine orchestration and data structures without introducing new external input parsing.locales:embed.gocontains no runtime external input handling.manifest:compile.goandsign.goare covered throughcmd/scmandpkg/api/provider.go;loader.goandmanifest.goparse fixed project-local files only.marketplace:builder.goandinstaller.goare in the matrix;discovery.goandmarketplace.goare read-only helpers around already-mapped local filesystem or in-memory index flows.pkg/api:provider.gois in the matrix;embed.gocontains no external input handling.plugin:installer.go,loader.go, andregistry.goare in the matrix;config.go,manifest.go, andplugin.godefine data structures and validation helpers only.repos:kbconfig.goandregistry.goare in the matrix;gitstate.goandworkconfig.goare config/struct helpers without new sink construction.