[scan] Security attack vector mapping #7

Open

opened 2026-03-23 12:54:06 +00:00 by Virgil · 0 comments

Virgil commented

2026-03-23 12:54:06 +00:00

Member

Map every external input entry point: function, file:line, input source, flows into, validation, attack vector.

Implementation Plan

[scan] Security attack vector mapping — Implementation Plan

For Codex agent: this document is the issue body to execute next.

Goal

Map every external input entry point across core/go-scm and produce a complete attack-vector matrix with evidence and validation status.

Scope

Repository: core/go-scm
Language: Go
Branch: agent/create-an-implementation-plan-for-this-s
Target files: all non-test Go source files

1) Every file to scan

Use this exact list as the scan set.

agentci/clotho.go
agentci/config.go
agentci/security.go
cmd/collect/cmd.go
cmd/collect/cmd_bitcointalk.go
cmd/collect/cmd_dispatch.go
cmd/collect/cmd_excavate.go
cmd/collect/cmd_github.go
cmd/collect/cmd_market.go
cmd/collect/cmd_papers.go
cmd/collect/cmd_process.go
cmd/forge/cmd_auth.go
cmd/forge/cmd_config.go
cmd/forge/cmd_forge.go
cmd/forge/cmd_issues.go
cmd/forge/cmd_labels.go
cmd/forge/cmd_migrate.go
cmd/forge/cmd_orgs.go
cmd/forge/cmd_prs.go
cmd/forge/cmd_repos.go
cmd/forge/cmd_status.go
cmd/forge/cmd_sync.go
cmd/forge/helpers.go
cmd/gitea/cmd_config.go
cmd/gitea/cmd_gitea.go
cmd/gitea/cmd_issues.go
cmd/gitea/cmd_mirror.go
cmd/gitea/cmd_prs.go
cmd/gitea/cmd_repos.go
cmd/gitea/cmd_sync.go
cmd/scm/cmd_compile.go
cmd/scm/cmd_export.go
cmd/scm/cmd_index.go
cmd/scm/cmd_scm.go
collect/bitcointalk.go
collect/collect.go
collect/events.go
collect/excavate.go
collect/github.go
collect/market.go
collect/papers.go
collect/process.go
collect/ratelimit.go
collect/state.go
forge/client.go
forge/config.go
forge/issues.go
forge/labels.go
forge/meta.go
forge/orgs.go
forge/prs.go
forge/repos.go
forge/webhooks.go
git/git.go
git/service.go
gitea/client.go
gitea/config.go
gitea/issues.go
gitea/meta.go
gitea/repos.go
jobrunner/forgejo/signals.go
jobrunner/forgejo/source.go
jobrunner/handlers/completion.go
jobrunner/handlers/dispatch.go
jobrunner/handlers/enable_auto_merge.go
jobrunner/handlers/publish_draft.go
jobrunner/handlers/resolve_threads.go
jobrunner/handlers/send_fix_command.go
jobrunner/handlers/tick_parent.go
jobrunner/journal.go
jobrunner/poller.go
jobrunner/types.go
locales/embed.go
manifest/compile.go
manifest/loader.go
manifest/manifest.go
manifest/sign.go
marketplace/builder.go
marketplace/discovery.go
marketplace/installer.go
marketplace/marketplace.go
pkg/api/embed.go
pkg/api/provider.go
plugin/config.go
plugin/installer.go
plugin/loader.go
plugin/manifest.go
plugin/plugin.go
plugin/registry.go
repos/gitstate.go
repos/kbconfig.go
repos/registry.go
repos/workconfig.go

2) What to look for in each file

A) External input entry points

CLI entry arguments and flags
CLI env-config precedence (config, FORGE_*, GITEA_*)
HTTP request handlers, webhook payloads, request headers, params, and body
Filesystem paths and filenames supplied as config/flags/env/user values
External URLs, repo names, owner/org names, branch names
SSH command inputs and key material
Collector source inputs (search terms, query terms, package names, IDs)
Job signals/labels/checklists/issue fields/command payloads

B) Validation gaps to record

Missing/weak allowlists for path segments and IDs
Missing canonicalisation of URLs/paths/refs before use
Incomplete enum/type validation (state, status, action)
Unsafe conversions/casts without bounds checks
Insufficient escaping for shell, file, JSON, or command composition
Trusting external API responses without schema checks
Ignoring error paths that should halt processing
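As a reference point when judging whether a path-handling site counts as a gap, the canonicalise-then-contain check might look like the sketch below. `ResolveWithin`, `ErrEscapesBase`, and the `/srv/scm-data` base directory are hypothetical names for illustration and are not taken from core/go-scm. The sketch does not resolve symlinks, so it is a lower bound on what a real check would need.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// ErrEscapesBase is returned when a user-supplied path resolves outside base.
var ErrEscapesBase = errors.New("path escapes base directory")

// ResolveWithin joins userPath onto base, canonicalises the result, and
// rejects anything that would land outside base. Symlinks are NOT resolved.
func ResolveWithin(base, userPath string) (string, error) {
	full := filepath.Join(base, userPath) // Join also cleans "." and ".." segments
	rel, err := filepath.Rel(base, full)
	if err != nil {
		return "", err
	}
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", ErrEscapesBase
	}
	return full, nil
}

func main() {
	// Hypothetical inputs: one benign, one traversal attempt.
	for _, p := range []string{"reports/out.md", "../../etc/passwd"} {
		resolved, err := ResolveWithin("/srv/scm-data", p)
		fmt.Printf("%q -> %q, err=%v\n", p, resolved, err)
	}
}
```

A site that builds a path from a flag, env value, or API response without an equivalent containment step would be recorded under "Missing canonicalisation of URLs/paths/refs before use".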

C) Injection vectors

OS command construction
Path traversal / arbitrary file write/read
Log forging / token leak via logging
SSRF via configurable URLs and webhooks
Template/HTML injection through rendered output paths
SQL-like or LDAP-like interpolation (where backend uses query strings)
Git command argument injection and branch/ref injection
Header injection in HTTP clients/servers
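For the Git argument and branch/ref injection item, a minimal sketch of the kind of guard the scan should look for is shown here. `ValidateRef` and its allowlist are assumptions made for illustration; the pattern is deliberately stricter than git's own ref-name rules and is not the repository's actual validation.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
)

// refPattern is a conservative allowlist (an assumption for this sketch);
// git itself accepts a wider set of ref names.
var refPattern = regexp.MustCompile(`^[A-Za-z0-9._/-]+$`)

// ValidateRef rejects ref names that could be parsed as options or that
// contain characters outside the allowlist.
func ValidateRef(ref string) error {
	switch {
	case ref == "":
		return errors.New("empty ref")
	case strings.HasPrefix(ref, "-"):
		return errors.New("ref must not start with '-' (would be parsed as an option)")
	case strings.Contains(ref, ".."):
		return errors.New("ref must not contain '..'")
	case !refPattern.MatchString(ref):
		return errors.New("ref contains characters outside the allowlist")
	}
	return nil
}

func main() {
	// Hypothetical inputs, such as a branch name taken from a flag or webhook payload.
	for _, ref := range []string{"main", "feature/scan-7", "--upload-pack=/tmp/x", "a..b", "x;rm -rf"} {
		fmt.Printf("%-24q err=%v\n", ref, ValidateRef(ref))
	}
}
```

Where a call site passes a ref straight into a git command line with no such check, the row's validation column would read "none" and the attack vector column would name argument injection.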

D) Race condition risks

Shared mutable state in handlers and services
Journal writes and read/write-back cycles without coordination
Async polling loops sharing token/counter/state
Map/slice writes from worker goroutines
TOCTOU around file presence/load/read-modify-write sequences
Cache + state refresh races under concurrent polling
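To make the "map/slice writes from worker goroutines" item concrete, the sketch below shows the coordinated pattern the scan would expect to find; its absence is the finding. `PollState` and `CountConcurrently` are hypothetical names, not types from the jobrunner package.

```go
package main

import (
	"fmt"
	"sync"
)

// PollState guards a shared map with a mutex so concurrent workers
// cannot corrupt it. Hypothetical type for illustration only.
type PollState struct {
	mu     sync.Mutex
	counts map[string]int
}

func NewPollState() *PollState {
	return &PollState{counts: make(map[string]int)}
}

// Inc increments the counter for key under the lock.
func (s *PollState) Inc(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.counts[key]++
}

// Get reads the counter for key under the lock.
func (s *PollState) Get(key string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.counts[key]
}

// CountConcurrently starts the given number of goroutines, each
// incrementing the same key once, and returns the final count.
func CountConcurrently(workers int) int {
	s := NewPollState()
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.Inc("polls")
		}()
	}
	wg.Wait()
	return s.Get("polls")
}

func main() {
	fmt.Println(CountConcurrently(100))
}
```

Removing the mutex from `Inc` turns this into the unsynchronised map write the list describes, which `go test -race` would normally flag.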

3) Required output format for findings

Each finding row must contain all columns in this exact order:

file:line | input source | flows into | validation | attack vector

Example format:

collect/github.go:142 | cmd flag --org (string) | buildGitHubCollector(config) -> net/http request URL | domain allowlist absent | SSRF + data exfil by domain override
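The "domain allowlist absent" entry in the example row refers to a check along the lines of the sketch below. `AllowedURL` and the contents of `allowedHosts` are assumptions chosen for illustration; the real set of permitted hosts would come from the project's configuration. The sketch covers only scheme and host matching and does not address redirects or DNS rebinding.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// allowedHosts is an illustrative allowlist, not the project's actual one.
var allowedHosts = map[string]bool{
	"api.github.com": true,
}

// AllowedURL reports whether raw is an https URL whose host is on the allowlist.
func AllowedURL(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil {
		return false
	}
	host := strings.ToLower(u.Hostname())
	return u.Scheme == "https" && allowedHosts[host]
}

func main() {
	// Hypothetical inputs, such as a base URL taken from a flag or env value.
	for _, raw := range []string{
		"https://api.github.com/orgs/example",
		"http://api.github.com/orgs/example",
		"https://api.github.com@evil.example/",
		"https://evil.example/",
	} {
		fmt.Printf("%-45s allowed=%v\n", raw, AllowedURL(raw))
	}
}
```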

Minimal per-row capture fields

file:line (primary function where input first enters)
input source (flag/env/body/path/header/id)
flows into (target function/call chain)
validation (what checks currently exist, if any)
attack vector (confidentiality/integrity/availability risk)
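One way to keep rows consistent while scanning is to collect findings in a small struct and render them in the required column order. The `Finding` type and `Row` method below are a hypothetical helper, not part of core/go-scm; the sample values are copied from the example row above.

```go
package main

import (
	"fmt"
	"strings"
)

// Finding holds the five required columns of one matrix row.
// Hypothetical helper type for illustration.
type Finding struct {
	File         string
	Line         int
	InputSource  string
	FlowsInto    string
	Validation   string
	AttackVector string
}

// Row renders the finding as
// "file:line | input source | flows into | validation | attack vector".
func (f Finding) Row() string {
	cols := []string{
		fmt.Sprintf("%s:%d", f.File, f.Line),
		f.InputSource,
		f.FlowsInto,
		f.Validation,
		f.AttackVector,
	}
	return strings.Join(cols, " | ")
}

func main() {
	f := Finding{
		File:         "collect/github.go",
		Line:         142,
		InputSource:  "cmd flag --org (string)",
		FlowsInto:    "buildGitHubCollector(config) -> net/http request URL",
		Validation:   "domain allowlist absent",
		AttackVector: "SSRF + data exfil by domain override",
	}
	fmt.Println(f.Row())
}
```

The sketch does not escape a literal `|` inside a field; if any column can contain one, the report would need an escaping convention.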

Evidence fields (optional but preferred)

CWE id
Repro path
severity (low/medium/high)
confidence (high/medium/low)
mitigation candidate

4) Where to write the report

Primary report file to produce:

docs/security/scan-attack-vector-mapping-report.md

This file must contain:

The completed matrix in the required format above
Deduplicated list of attack vectors with severity
Verification status (mapped, validated, open)
A final summary by subsystem and risk priority

This issue body should be replaced with a pointer plus a short runbook:

"Execution plan in docs/plans/2026-03-24-security-attack-vector-mapping.md, results in docs/security/scan-attack-vector-mapping-report.md."

Execution order

1. Create report file and add header + schema columns
2. Scan files in package order listed above
3. For each file, capture every external input entry point and map to sinks
4. Populate one row per mapped flow in the required file:line | ... format
5. Cross-check for duplicates and deduplicate by identical sink and attack vector
6. Add severity and validation gap notes per row
7. Finalise summary + high-priority follow-up list
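Step 5 can be sketched as a first-occurrence-wins pass keyed on the sink and attack vector columns. `MappedFlow` and `Dedupe` are hypothetical names, and the sample rows use placeholder locations rather than real findings.

```go
package main

import "fmt"

// MappedFlow is a trimmed-down row holding only the fields that matter
// for deduplication. Hypothetical type for illustration.
type MappedFlow struct {
	Location     string // file:line
	FlowsInto    string // sink
	AttackVector string
}

// Dedupe keeps the first row for each (sink, attack vector) pair and
// preserves the original order of the rows it keeps.
func Dedupe(flows []MappedFlow) []MappedFlow {
	type key struct{ sink, vector string }
	seen := make(map[key]bool)
	out := make([]MappedFlow, 0, len(flows))
	for _, f := range flows {
		k := key{f.FlowsInto, f.AttackVector}
		if seen[k] {
			continue
		}
		seen[k] = true
		out = append(out, f)
	}
	return out
}

func main() {
	// Placeholder rows, not actual scan results.
	flows := []MappedFlow{
		{"example/a.go:10", "exec.Command", "command injection"},
		{"example/b.go:22", "exec.Command", "command injection"},
		{"example/c.go:31", "os.WriteFile", "path traversal"},
	}
	for _, f := range Dedupe(flows) {
		fmt.Println(f.Location, "|", f.FlowsInto, "|", f.AttackVector)
	}
}
```

Keeping the first occurrence means the surviving row points at the earliest file in scan order; the dropped locations could be listed in that row's notes so no entry point is lost from the matrix.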


Virgil referenced this issue

2026-03-24 13:18:45 +00:00

[agent/codex] Execute the security attack vector mapping scan described in... #13