[scan] Security attack vector mapping #8

Open
opened 2026-03-23 12:54:01 +00:00 by Virgil · 1 comment
Member

Map every external input entry point: function, file:line, input source, flows into, validation, attack vector.


Implementation Plan (Spark)

[scan] Security attack vector mapping — implementation plan

Use this issue body as the execution plan for the scan.

1) Files to scan

Scope includes all source and test files in the repository, plus CLI/CI entry files that can change execution behaviour:

  • cmd/security/cmd_security.go
  • cmd/security/cmd_scan.go
  • cmd/security/cmd_jobs.go
  • cmd/security/cmd_alerts.go
  • cmd/security/cmd_deps.go
  • cmd/security/cmd_secrets.go
  • cmd/security/cmd.go
  • cmd/metrics/cmd.go
  • cmd/metrics/cmd_test.go
  • cmd/rag/cmd.go
  • cmd/lab/cmd_lab.go
  • cmd/embed-bench/main.go
  • ai/ai.go
  • ai/metrics.go
  • ai/rag.go
  • ai/metrics_test.go
  • ai/metrics_bench_test.go
  • .forgejo/workflows/security-scan.yml

2) What to inspect per file

For each file, map:

  • external input sources (CLI flags, environment-driven config, file paths, HTTP endpoints, command arguments, fixed data from external services)
  • validation gaps (length checks, enum checks, format checks, allow/deny lists)
  • injection vectors (command execution, network URL/path injection, log/format injection, header/body poisoning)
  • race conditions (shared state, non-locking read/write, mutable globals, shared clients)
  • data-flow paths from input to sensitive sink (process execution, file writes, HTTP calls, output rendering)
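To make the "validation gaps" item concrete, here is a minimal sketch of a format check and an enum check in Go. The names `validateTarget`, `validateSeverity`, `targetPattern` and `allowedSeverities` are hypothetical and do not come from the repository. The character class is an assumed conservative shape for owner/repo, and the severity values are illustrative only.

```go
package main

import (
	"fmt"
	"regexp"
)

// targetPattern is an ASSUMED shape for "owner/repo": exactly one slash,
// each segment starting with an alphanumeric character followed by
// letters, digits, '.', '_' or '-'. Real forge naming rules may differ.
var targetPattern = regexp.MustCompile(
	`^[A-Za-z0-9][A-Za-z0-9._-]{0,99}/[A-Za-z0-9][A-Za-z0-9._-]{0,99}$`)

// allowedSeverities is a HYPOTHETICAL enum for a --severity flag.
var allowedSeverities = map[string]bool{
	"critical": true, "high": true, "medium": true, "low": true,
}

// validateTarget rejects anything that is not a plain owner/repo pair,
// including extra path segments, query strings and leading dashes.
func validateTarget(target string) error {
	if !targetPattern.MatchString(target) {
		return fmt.Errorf("invalid target %q: want owner/repo", target)
	}
	return nil
}

// validateSeverity accepts only values present in the allowlist.
func validateSeverity(s string) error {
	if !allowedSeverities[s] {
		return fmt.Errorf("invalid severity %q", s)
	}
	return nil
}

func main() {
	targets := []string{"core/go-ai", "core/../etc", "core/go-ai?x=1", "--repo/x"}
	for _, t := range targets {
		fmt.Printf("%q valid=%v\n", t, validateTarget(t) == nil)
	}
	fmt.Println("severity check:", validateSeverity("high"), validateSeverity("urgent"))
}
```

When mapping a file, a check of this kind being absent between a flag and a sink is what would be recorded as a validation gap.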

3) Output format for every finding

Add one row per finding in this order:

| file:line | input source | flows into | validation status | attack vector |
|---|---|---|---|---|
| cmd/security/cmd_security.go:16 | shared globals (securityTarget, securityRepo, securitySeverity, securityRegistryPath, securityJSON) | all security subcommands | Observed (via function-specific flag checks only) | global mutable state can leak between runs if commands are reused in tests/tools |
| cmd/security/cmd_scan.go:27 | --registry, --repo, --severity, --tool, --target, --json flags | loadRegistry, filtering, API fetches, terminal output | Partial (basic parsing, no enum/regex constraints for repo/target/severity) | invalid owner/repo can poison endpoint construction and output context |
| cmd/security/cmd_alerts.go:21 | same flags as above for alerts | alerts fetchers and table output | Partial | same as scan path; secret/code/dependabot alert processing |
| cmd/security/cmd_deps.go:21 | same flags as above for deps | Dependabot fetcher and upgrade summary output | Partial | package/version strings can influence downstream output formatting |
| cmd/security/cmd_secrets.go:21 | same flags as above for secrets | secret alert fetcher and CLI output | Partial | secret metadata and state strings from API are rendered unsanitised |
| cmd/security/cmd_jobs.go:22 | --targets, --issue-repo, --dry-run, --copies | issue creation (gh issue create) and metric writes | Partial (target format check only) | command arg injection through crafted issue-repo and owner/repo labels; large targets payload can exhaust command output/report size |
| cmd/security/cmd_security.go:130 | gh binary path lookup + external command execution | runGHAPI runs gh api with endpoint arg | Partial (no command output sanitisation) | endpoint path includes user-controlled repo name; command execution boundary should be threat-modeled |
| cmd/security/cmd_security.go:108 | securityRegistryPath string flag | repos.LoadRegistry(io.Local, registryPath) and fallback discovery | Partial | path traversal/symlink and arbitrary file read concerns in registry loading path |
| cmd/security/cmd_security.go:194 | target string (owner/repo) split validation | all run*ForTarget entrypoints and API endpoint builders | Weak (owner/repo only split, no char/class checks) | malformed targets can influence external tool input (gh api, issue labels) and logs |
| cmd/security/cmd_security.go:300 | repoFullName in GitHub endpoint builders | fetch functions fetchDependabotAlerts/fetchCodeScanningAlerts/fetchSecretScanningAlerts | Weak | endpoint query parameters are hardcoded but path segment is untrusted input |
| cmd/lab/cmd_lab.go:42 | --bind CLI flag | http.Server{Addr: cfg.Addr} and route handlers | Partial (no address validation) | bind to unexpected interfaces/ports, local-privilege boundary exposure if cmd used in shared hosts |
| cmd/lab/cmd_lab.go:24 | cfg from runtime config (cfg.ForgeURL, tokens, intervals, etc.) | handler wiring, external collectors, HTTP polling intervals | Unknown (external package implementation) | trust boundary extends through external collectors and tokens (inspect downstream integrations separately) |
| cmd/embed-bench/main.go:28 | --ollama URL flag | HTTP client + URL concatenation for /api/embeddings, /api/tags | Weak (no scheme/host allowlist, no host lock) | SSRF/LAN pivot + plaintext credentials exposure if non-TLS endpoint is provided |
| cmd/embed-bench/main.go:223 | TLS config on shared HTTP client (InsecureSkipVerify) | all outbound requests | Weak | MITM risk and cert validation bypass for embedding calls |
| cmd/embed-bench/main.go:238 | text payload from queries/memory arrays | JSON POST body + decoding response | None for memory strings; constant data | low immediate risk in this file; still inspect response decode and size handling |
| cmd/metrics/cmd.go:31 | --since string | parseDuration -> time.Now().Add(-since) | Partial | unsupported token/large unit handling and duration overflow edge cases should be confirmed |
| cmd/metrics/cmd.go:105 | --since value format | parseDuration numeric parsing and unit switch | Good for basic format, no bounds | integer overflow and negative/zero edge path tested but should re-check for very large inputs |
| ai/metrics.go:46 | metricsSince runtime-derived file path/time window | Record/ReadEvents file writes and reads | Weak | write lock only for Record; concurrent ReadEvents may observe partially written lines without lock |
| ai/metrics.go:87 | path date iteration from time.Time input | readMetricsFile scanner loop | Good (time-bounded) | file lock semantics and scanner token limits should be reviewed under high-volume or malformed JSONL |
| ai/rag.go:22 | TaskInfo.Title and TaskInfo.Description | concatenated query -> RAG clients (rag.Query) | No input validation in this package | prompt/injection style input to model/RAG service; verify client side escaping expectations |
| ai/metrics_test.go / ai/metrics_bench_test.go / cmd/metrics/cmd_test.go | test data and temp env vars | all public metric APIs | N/A | ensure test helpers cannot influence production code paths through build tags or shared env assumptions |
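For the --ollama row above, a scheme and host allowlist might look roughly like the following sketch. `validateEndpoint` and the allowlist contents are hypothetical and are not taken from cmd/embed-bench/main.go; only standard-library `net/url` calls are used.

```go
package main

import (
	"fmt"
	"net/url"
)

// validateEndpoint is a HYPOTHETICAL helper: it parses a user-supplied
// base URL and rejects unexpected schemes, embedded credentials, and
// hosts that are not on the caller's allowlist.
func validateEndpoint(raw string, allowedHosts map[string]bool) (*url.URL, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return nil, fmt.Errorf("parse %q: %w", raw, err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return nil, fmt.Errorf("unsupported scheme %q", u.Scheme)
	}
	if u.User != nil {
		return nil, fmt.Errorf("credentials in URL are not allowed")
	}
	// Hostname strips the port, so the allowlist is keyed on host only.
	if !allowedHosts[u.Hostname()] {
		return nil, fmt.Errorf("host %q is not in the allowlist", u.Hostname())
	}
	return u, nil
}

func main() {
	// ASSUMED allowlist for illustration; a real deployment would choose its own.
	allowed := map[string]bool{"localhost": true, "127.0.0.1": true}
	candidates := []string{
		"http://localhost:11434",
		"file:///etc/passwd",
		"http://169.254.169.254/latest/meta-data",
		"http://user:pw@localhost:11434",
	}
	for _, raw := range candidates {
		_, err := validateEndpoint(raw, allowed)
		fmt.Printf("%-45s err=%v\n", raw, err)
	}
}
```

A check like this narrows the SSRF surface but does not address the InsecureSkipVerify row, which concerns certificate validation rather than destination choice.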

4) Where to write the report

  • Write the completed mapping results to:
    • SECURITY_ATTACK_VECTOR_MAPPING_REPORT.md
  • Keep rows in the schema above.
  • Include an executive summary first, then:
    • High
    • Medium
    • Low
    • No issue
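The report layout described above (executive summary first, then one section per severity, rows in the five-column schema) could be generated along these lines. This is only a sketch: the `Finding` type, `renderReport`, and the severity assigned in `main` are placeholders for illustration, not results of the scan.

```go
package main

import (
	"fmt"
	"strings"
)

// Finding mirrors the five-column row schema, plus a severity bucket.
type Finding struct {
	Location, Source, FlowsInto, Validation, Vector string
	Severity                                        string // one of severityOrder
}

// severityOrder fixes the section order requested for the report.
var severityOrder = []string{"High", "Medium", "Low", "No issue"}

// cell escapes pipes so free text cannot break the table layout.
func cell(s string) string {
	return strings.ReplaceAll(s, "|", `\|`)
}

// renderReport writes the summary, then one table per severity section.
func renderReport(summary string, findings []Finding) string {
	var b strings.Builder
	b.WriteString("# Executive summary\n\n" + summary + "\n")
	for _, sev := range severityOrder {
		fmt.Fprintf(&b, "\n## %s\n\n", sev)
		b.WriteString("| file:line | input source | flows into | validation status | attack vector |\n")
		b.WriteString("|---|---|---|---|---|\n")
		for _, f := range findings {
			if f.Severity != sev {
				continue
			}
			fmt.Fprintf(&b, "| %s | %s | %s | %s | %s |\n",
				cell(f.Location), cell(f.Source), cell(f.FlowsInto),
				cell(f.Validation), cell(f.Vector))
		}
	}
	return b.String()
}

func main() {
	findings := []Finding{{
		Location:   "cmd/embed-bench/main.go:223",
		Source:     "TLS config on shared HTTP client (InsecureSkipVerify)",
		FlowsInto:  "all outbound requests",
		Validation: "Weak",
		Vector:     "MITM risk and cert validation bypass for embedding calls",
		Severity:   "High", // PLACEHOLDER severity, not a scan result
	}}
	fmt.Print(renderReport("Summary text goes here.", findings))
}
```

Redirecting the output to SECURITY_ATTACK_VECTOR_MAPPING_REPORT.md would give a file in the requested shape.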

Execution steps (for Codex agent)

  1. Start at cmd/security/cmd_security.go, then expand into each security subcommand file and shared helpers.
  2. Validate every exec.Command call boundary (gh, lab config flow, issue creation) and enumerate argument construction.
  3. Verify path/registry loading and target parsing for format and traversal/command/endpoint poisoning.
  4. Inspect outbound network boundaries (embed-bench, ai/rag, lab collectors via cfg) for trust boundaries and hardening.
  5. Review concurrency and shared state: metricsMu, shared CLI variables, and test overrides.
  6. Populate the report table for every confirmed external input point and include evidence references (file:line + snippet context).
  7. Confirm output format and severity in final issue body, then submit as a follow-up comment/reference to this issue.
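For step 2, the property worth checking at each exec.Command boundary is that user-controlled values arrive as discrete, escaped argument elements and never pass through a shell. A minimal sketch of that pattern follows. `buildAlertsCommand` is a hypothetical stand-in, not the repository's runGHAPI, and the command is only constructed here, never executed.

```go
package main

import (
	"fmt"
	"net/url"
	"os/exec"
)

// checkSegment rejects values that url.PathEscape leaves untouched but
// that would still change the meaning of a path ("", "." and "..").
func checkSegment(name, value string) error {
	if value == "" || value == "." || value == ".." {
		return fmt.Errorf("invalid %s %q", name, value)
	}
	return nil
}

// buildAlertsCommand is a HYPOTHETICAL helper. It escapes each path
// segment and passes the endpoint as a single argv element, so no shell
// parsing is involved. Because the endpoint always begins with "repos/",
// it cannot be mistaken for a command-line flag.
func buildAlertsCommand(owner, repo string) (*exec.Cmd, error) {
	if err := checkSegment("owner", owner); err != nil {
		return nil, err
	}
	if err := checkSegment("repo", repo); err != nil {
		return nil, err
	}
	endpoint := fmt.Sprintf("repos/%s/%s/dependabot/alerts",
		url.PathEscape(owner), url.PathEscape(repo))
	return exec.Command("gh", "api", endpoint), nil
}

func main() {
	for _, repo := range []string{"go-ai", "a/b", ".."} {
		cmd, err := buildAlertsCommand("core", repo)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		// Print the argument vector instead of running the command.
		fmt.Println(cmd.Args)
	}
}
```

During the review, any call site that instead builds a single command string, or interpolates unescaped input into the endpoint, is a candidate row for the report.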
Author
Member

Security Scan: Attack Vector Map completed. Details in agent log.

Reference: core/go-ai#8