[scan] Security attack vector mapping #7

Open
opened 2026-03-23 12:54:06 +00:00 by Virgil · 0 comments
Member

Map every external input entry point: function, file:line, input source, flows into, validation, attack vector.


Implementation Plan

[scan] Security attack vector mapping — Implementation Plan

For Codex agent: this document is the issue body to execute next.

Goal

Map every external input entry point across core/go-scm and produce a complete attack-vector matrix with evidence and validation status.

Scope

  • Repository: core/go-scm
  • Language: Go
  • Branch: agent/create-an-implementation-plan-for-this-s
  • Target files: all non-test Go source files

1) Every file to scan

Use this exact list as the scan set.

  • agentci/clotho.go
  • agentci/config.go
  • agentci/security.go
  • cmd/collect/cmd.go
  • cmd/collect/cmd_bitcointalk.go
  • cmd/collect/cmd_dispatch.go
  • cmd/collect/cmd_excavate.go
  • cmd/collect/cmd_github.go
  • cmd/collect/cmd_market.go
  • cmd/collect/cmd_papers.go
  • cmd/collect/cmd_process.go
  • cmd/forge/cmd_auth.go
  • cmd/forge/cmd_config.go
  • cmd/forge/cmd_forge.go
  • cmd/forge/cmd_issues.go
  • cmd/forge/cmd_labels.go
  • cmd/forge/cmd_migrate.go
  • cmd/forge/cmd_orgs.go
  • cmd/forge/cmd_prs.go
  • cmd/forge/cmd_repos.go
  • cmd/forge/cmd_status.go
  • cmd/forge/cmd_sync.go
  • cmd/forge/helpers.go
  • cmd/gitea/cmd_config.go
  • cmd/gitea/cmd_gitea.go
  • cmd/gitea/cmd_issues.go
  • cmd/gitea/cmd_mirror.go
  • cmd/gitea/cmd_prs.go
  • cmd/gitea/cmd_repos.go
  • cmd/gitea/cmd_sync.go
  • cmd/scm/cmd_compile.go
  • cmd/scm/cmd_export.go
  • cmd/scm/cmd_index.go
  • cmd/scm/cmd_scm.go
  • collect/bitcointalk.go
  • collect/collect.go
  • collect/events.go
  • collect/excavate.go
  • collect/github.go
  • collect/market.go
  • collect/papers.go
  • collect/process.go
  • collect/ratelimit.go
  • collect/state.go
  • forge/client.go
  • forge/config.go
  • forge/issues.go
  • forge/labels.go
  • forge/meta.go
  • forge/orgs.go
  • forge/prs.go
  • forge/repos.go
  • forge/webhooks.go
  • git/git.go
  • git/service.go
  • gitea/client.go
  • gitea/config.go
  • gitea/issues.go
  • gitea/meta.go
  • gitea/repos.go
  • jobrunner/forgejo/signals.go
  • jobrunner/forgejo/source.go
  • jobrunner/handlers/completion.go
  • jobrunner/handlers/dispatch.go
  • jobrunner/handlers/enable_auto_merge.go
  • jobrunner/handlers/publish_draft.go
  • jobrunner/handlers/resolve_threads.go
  • jobrunner/handlers/send_fix_command.go
  • jobrunner/handlers/tick_parent.go
  • jobrunner/journal.go
  • jobrunner/poller.go
  • jobrunner/types.go
  • locales/embed.go
  • manifest/compile.go
  • manifest/loader.go
  • manifest/manifest.go
  • manifest/sign.go
  • marketplace/builder.go
  • marketplace/discovery.go
  • marketplace/installer.go
  • marketplace/marketplace.go
  • pkg/api/embed.go
  • pkg/api/provider.go
  • plugin/config.go
  • plugin/installer.go
  • plugin/loader.go
  • plugin/manifest.go
  • plugin/plugin.go
  • plugin/registry.go
  • repos/gitstate.go
  • repos/kbconfig.go
  • repos/registry.go
  • repos/workconfig.go
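
Before scanning, it may be worth confirming that the list above still matches the tree. The following is a minimal sketch, assuming the repository is checked out locally; `isScanTarget` and `listScanTargets` are hypothetical helper names, and skipping `.git` and `vendor` is an assumption rather than something this plan specifies.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sort"
	"strings"
)

// isScanTarget reports whether path names a non-test Go source file.
func isScanTarget(path string) bool {
	return strings.HasSuffix(path, ".go") && !strings.HasSuffix(path, "_test.go")
}

// listScanTargets walks root and returns the sorted, slash-separated
// relative paths of every scan target found below it.
func listScanTargets(root string) ([]string, error) {
	var files []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			// Assumption: VCS metadata and vendored code are out of scope.
			if name := d.Name(); path != root && (name == ".git" || name == "vendor") {
				return filepath.SkipDir
			}
			return nil
		}
		if isScanTarget(path) {
			rel, err := filepath.Rel(root, path)
			if err != nil {
				return err
			}
			files = append(files, filepath.ToSlash(rel))
		}
		return nil
	})
	sort.Strings(files)
	return files, err
}

func main() {
	root := "."
	if len(os.Args) > 1 {
		root = os.Args[1]
	}
	files, err := listScanTargets(root)
	if err != nil {
		fmt.Fprintln(os.Stderr, "walk failed:", err)
		os.Exit(1)
	}
	for _, f := range files {
		fmt.Println(f)
	}
}
```

Diffing this output against the scan set shows files that were added or removed after the plan was written.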

2) What to look for in each file

A) External input entry points

  • CLI entry arguments and flags
  • CLI env-config precedence (config, FORGE_*, GITEA_*)
  • HTTP request handlers, webhook payloads, request headers, params, and body
  • Filesystem paths and filenames supplied as config/flags/env/user values
  • External URLs, repo names, owner/org names, branch names
  • SSH command inputs and key material
  • Collector source inputs (search terms, query terms, package names, IDs)
  • Job signals/labels/checklists/issue fields/command payloads

B) Validation gaps to record

  • Missing/weak allowlists for path segments and IDs
  • Missing canonicalisation of URLs/paths/refs before use
  • Incomplete enum/type validation (state, status, action)
  • Unsafe conversions/casts without bounds checks
  • Insufficient escaping for shell, file, JSON, or command composition
  • Trusting external API responses without schema checks
  • Ignoring error paths that should halt processing

C) Injection vectors

  • OS command construction
  • Path traversal / arbitrary file write/read
  • Log forging / token leak via logging
  • SSRF via configurable URLs and webhooks
  • Template/HTML injection through rendered output paths
  • SQL-like or LDAP-like interpolation (where backend uses query strings)
  • Git command argument injection and branch/ref injection
  • Header injection in HTTP clients/servers
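
For the Git argument and ref injection item, a reviewer can compare each call site against something like the sketch below. `validateRef` and `headCommitArgs` are hypothetical names, and the rules are a conservative subset of Git's ref-format rules, not a complete implementation. The argument slice is meant to be passed to a process directly (for example via `os/exec`), never through a shell.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// validateRef applies a conservative subset of Git's ref-name rules.
// The leading-dash check is what stops a ref being parsed as an option.
func validateRef(ref string) error {
	if ref == "" || len(ref) > 255 {
		return errors.New("ref is empty or too long")
	}
	if strings.HasPrefix(ref, "-") {
		return errors.New("ref must not start with '-'")
	}
	if strings.Contains(ref, "..") || strings.Contains(ref, "@{") ||
		strings.HasSuffix(ref, "/") || strings.HasSuffix(ref, ".lock") {
		return errors.New("ref contains a forbidden sequence")
	}
	for _, r := range ref {
		// Reject whitespace, control characters, and Git's special characters.
		if r <= ' ' || r == 0x7f || strings.ContainsRune("~^:?*[\\", r) {
			return fmt.Errorf("ref contains forbidden character %q", r)
		}
	}
	return nil
}

// headCommitArgs builds arguments for `git log` that print the commit hash
// of ref. The trailing "--" marks the end of revisions, so the ref cannot
// be reinterpreted as a path.
func headCommitArgs(ref string) ([]string, error) {
	if err := validateRef(ref); err != nil {
		return nil, err
	}
	return []string{"log", "-1", "--format=%H", ref, "--"}, nil
}

func main() {
	args, err := headCommitArgs("main")
	fmt.Println(args, err)
	_, err = headCommitArgs("--upload-pack=evil")
	fmt.Println(err)
}
```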

D) Race condition risks

  • Shared mutable state in handlers and services
  • Journal writes and read/write-back cycles without coordination
  • Async polling loops sharing token/counter/state
  • Map/slice writes from worker goroutines
  • TOCTOU around file presence/load/read-modify-write sequences
  • Cache + state refresh races under concurrent polling
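
As a reference point for the shared-state and read-modify-write items, the sketch below shows a mutex-guarded update that is safe under concurrent workers. `counterJournal` is a hypothetical type for illustration and is not the structure in `jobrunner/journal.go`. Code that performs the same kind of update without a lock or other coordination is a candidate finding; running tests with Go's `-race` flag can help confirm it.

```go
package main

import (
	"fmt"
	"sync"
)

// counterJournal is a hypothetical in-memory journal keyed by event name.
type counterJournal struct {
	mu     sync.Mutex
	counts map[string]int
}

func newCounterJournal() *counterJournal {
	return &counterJournal{counts: make(map[string]int)}
}

// Record performs the read-modify-write under the lock, so concurrent
// callers cannot lose updates or corrupt the map.
func (j *counterJournal) Record(key string) int {
	j.mu.Lock()
	defer j.mu.Unlock()
	j.counts[key]++
	return j.counts[key]
}

// Count reads the current value under the same lock.
func (j *counterJournal) Count(key string) int {
	j.mu.Lock()
	defer j.mu.Unlock()
	return j.counts[key]
}

// recordConcurrently simulates several workers writing to the journal at once.
func recordConcurrently(j *counterJournal, key string, workers int) {
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			j.Record(key)
		}()
	}
	wg.Wait()
}

func main() {
	j := newCounterJournal()
	recordConcurrently(j, "dispatch", 100)
	fmt.Println(j.Count("dispatch")) // prints 100
}
```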

3) Required output format for findings

Each finding row must contain all columns in this exact order:

file:line | input source | flows into | validation | attack vector

Example format:

collect/github.go:142 | cmd flag --org (string) | buildGitHubCollector(config) -> net/http request URL | domain allowlist absent | SSRF + data exfil by domain override

Minimal per-row capture fields

  • file:line (primary function where input first enters)
  • input source (flag/env/body/path/header/id)
  • flows into (target function/call chain)
  • validation (what checks currently exist, if any)
  • attack vector (confidentiality/integrity/availability risk)
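
The five-column row can be modelled as a small type so that rows are emitted and checked consistently. This is a sketch only: the `Finding` type and the `Row` and `ParseRow` helpers are hypothetical names, and the strict " | " separator handling is an assumption based on the example row above.

```go
package main

import (
	"fmt"
	"strings"
)

// Finding mirrors the five required columns, in the required order.
type Finding struct {
	Location     string // file:line
	InputSource  string
	FlowsInto    string
	Validation   string
	AttackVector string
}

// Row renders the finding in the pipe-separated report format.
func (f Finding) Row() string {
	return strings.Join([]string{
		f.Location, f.InputSource, f.FlowsInto, f.Validation, f.AttackVector,
	}, " | ")
}

// ParseRow splits a report row and rejects rows with missing or empty columns.
func ParseRow(row string) (Finding, error) {
	parts := strings.Split(row, " | ")
	if len(parts) != 5 {
		return Finding{}, fmt.Errorf("expected 5 columns, got %d", len(parts))
	}
	for i := range parts {
		parts[i] = strings.TrimSpace(parts[i])
		if parts[i] == "" {
			return Finding{}, fmt.Errorf("column %d is empty", i+1)
		}
	}
	return Finding{parts[0], parts[1], parts[2], parts[3], parts[4]}, nil
}

func main() {
	f := Finding{
		Location:     "collect/github.go:142",
		InputSource:  "cmd flag --org (string)",
		FlowsInto:    "buildGitHubCollector(config) -> net/http request URL",
		Validation:   "domain allowlist absent",
		AttackVector: "SSRF + data exfil by domain override",
	}
	fmt.Println(f.Row())
}
```

A cell that itself contains " | " would break this split, so such text needs rewording before it goes into a row.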

Evidence fields (optional but preferred)

  • CWE id
  • Repro path
  • severity (low/medium/high)
  • confidence (high/med/low)
  • mitigation candidate

4) Where to write the report

Primary report file to produce:

  • docs/security/scan-attack-vector-mapping-report.md

This file must contain:

  1. The completed matrix in the required format above
  2. Deduplicated list of attack vectors with severity
  3. Verification status (mapped, validated, open)
  4. A final summary by subsystem and risk priority
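
One possible way to build the per-subsystem summary is to group rows by the first path segment of `file:line`. This is a sketch under that assumption: the plan does not define "subsystem", and `subsystem` and `countBySubsystem` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// subsystem returns the first path segment of a file:line location,
// or "(root)" when the file sits at the repository root.
func subsystem(location string) string {
	if i := strings.Index(location, "/"); i > 0 {
		return location[:i]
	}
	return "(root)"
}

// countBySubsystem tallies how many findings fall under each subsystem.
func countBySubsystem(locations []string) map[string]int {
	counts := make(map[string]int)
	for _, loc := range locations {
		counts[subsystem(loc)]++
	}
	return counts
}

func main() {
	counts := countBySubsystem([]string{
		"collect/github.go:142",
		"collect/market.go:10",
		"forge/webhooks.go:55",
	})
	// Sort the keys so the summary is stable between runs.
	names := make([]string, 0, len(counts))
	for name := range counts {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%s: %d\n", name, counts[name])
	}
}
```

Grouping on the first segment lumps all of `cmd/` and `jobrunner/` together; a two-segment key may suit those packages better.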

This issue body should be replaced with a pointer plus a short runbook:

  • "Execution plan in docs/plans/2026-03-24-security-attack-vector-mapping.md, results in docs/security/scan-attack-vector-mapping-report.md."

Execution order

  1. Create report file and add header + schema columns
  2. Scan files in package order listed above
  3. For each file, capture every external input entry point and map to sinks
  4. Populate one row per mapped flow in the required file:line | ... format
  5. Cross-check for duplicates and deduplicate by identical sink and attack vector
  6. Add severity and validation gap notes per row
  7. Finalise summary + high-priority follow-up list
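
The deduplication step can be sketched as below. The assumption here is that "sink" corresponds to the "flows into" column; `Finding`, `dedupeKey`, and `dedupe` are hypothetical names. The first row seen for each (sink, attack vector) pair is kept.

```go
package main

import "fmt"

// Finding mirrors the five report columns.
type Finding struct {
	Location     string
	InputSource  string
	FlowsInto    string
	Validation   string
	AttackVector string
}

// dedupeKey identifies findings that share a sink and an attack vector.
type dedupeKey struct {
	sink   string
	vector string
}

// dedupe keeps the first finding for each (sink, attack vector) pair
// and preserves the original order of the remaining rows.
func dedupe(findings []Finding) []Finding {
	seen := make(map[dedupeKey]bool, len(findings))
	out := make([]Finding, 0, len(findings))
	for _, f := range findings {
		k := dedupeKey{sink: f.FlowsInto, vector: f.AttackVector}
		if seen[k] {
			continue
		}
		seen[k] = true
		out = append(out, f)
	}
	return out
}

func main() {
	rows := []Finding{
		{Location: "a.go:1", FlowsInto: "exec.Command", AttackVector: "command injection"},
		{Location: "b.go:9", FlowsInto: "exec.Command", AttackVector: "command injection"},
		{Location: "c.go:4", FlowsInto: "os.WriteFile", AttackVector: "path traversal"},
	}
	fmt.Println(len(dedupe(rows))) // prints 2
}
```

Dropped rows still carry distinct entry points, so it may be useful to list their locations alongside the surviving row instead of discarding them.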
Reference: core/go-scm#7