agent/docs/RFC-AGENT-PIPELINE.md


# Agentic Pipeline v2 — Autonomous Dispatch→Verify→Merge
> The full autonomous pipeline: issue → dispatch → implement → verify → PR → merge.
> CodeRabbit findings = 0 is the KPI.
---
## Pipeline Flow
```
Issue created (Forge/GitHub)
→ core-agent picks up event
→ Selects flow YAML based on event type + repo
→ Prepares sandboxed workspace (CODEX.md, .core/reference/)
→ Dispatches agent (codex/gemini/claude)
→ Agent implements in workspace
→ QA flow runs (build, test, vet, lint)
→ If QA passes → create PR to dev
→ CodeRabbit reviews PR
→ If findings = 0 → auto-merge
→ If findings > 0 → dispatch fix agent → repeat
→ PR merged → training data captured
→ Issue closed
```
## Key Design Decisions
### Sandboxing
Agents MUST be sandboxed to their assigned repo. Unsandboxed writes caused the CLI mess
(an agent wrote files to the wrong repo). Workspace isolation is non-negotiable.
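A minimal sketch of the write-guard this implies (function and path names are hypothetical): every path an agent asks to touch must resolve inside its assigned workspace root, with `../` escapes refused.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// insideWorkspace reports whether target resolves inside root, rejecting
// "../" escapes. A write outside the assigned workspace is refused.
func insideWorkspace(root, target string) bool {
	rel, err := filepath.Rel(filepath.Clean(root), filepath.Clean(target))
	if err != nil {
		return false
	}
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

func main() {
	fmt.Println(insideWorkspace("/ws/core-agent", "/ws/core-agent/cmd/main.go")) // true
	fmt.Println(insideWorkspace("/ws/core-agent", "/ws/other-repo/file.go"))     // false
}
```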
### CodeRabbit KPI
CodeRabbit findings = 0 is the target. Every finding means one of:
- Template didn't prevent it → fix the template
- Model didn't catch it → add to training data
- Convention wasn't documented → add to RFC
Zero findings = complete convention coverage.
### Checkin API
Agents check in with status via api.lthn.sh. Current blocker: Forge webhooks
need to fire to lthn.sh so the orchestrator knows when to start the pipeline.
### Security Model (from Charon flows)
Orchestrator uses STRUCTURAL signals only (labels, PR state, review counts).
Never parses comment CONTENT — immune to prompt injection via issue comments.
## Agent Pool Configuration
See `code/core/go/agent/RFC.md` §Dispatch & Pool Routing for the full `agent.yaml` schema (concurrency, rates, model variants, agent identities).
Concurrency is enforced by the runner service (core/agent). Slot reservation prevents a
TOCTOU race between parallel dispatches.
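The reservation can be sketched as a check-and-increment under one lock, so the free-slot check and the claim are a single atomic step (a sketch, not the runner's actual code):

```go
package main

import (
	"fmt"
	"sync"
)

// slotPool reserves a concurrency slot atomically: the capacity check and
// the increment happen under one lock, so two parallel dispatches cannot
// both see "1 slot free" and both claim it (the TOCTOU race).
type slotPool struct {
	mu    sync.Mutex
	used  int
	limit int
}

func (p *slotPool) Reserve() bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.used >= p.limit {
		return false
	}
	p.used++
	return true
}

func (p *slotPool) Release() {
	p.mu.Lock()
	p.used--
	p.mu.Unlock()
}

func main() {
	pool := &slotPool{limit: 2}
	fmt.Println(pool.Reserve(), pool.Reserve(), pool.Reserve()) // true true false
}
```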
## go-process Improvements Needed
- `Timeout` — kill after N minutes (currently agents can run forever)
- `GracePeriod` — SIGTERM before SIGKILL
- `KillGroup` — kill process group, not just PID (prevents orphaned subprocesses)
## Metrics
- 25 repos auto-merged in recent sweep
- 74 findings on core/agent alone (70+ fixed)
- Zero-finding rate improving as templates capture conventions
## `core pipeline` Command Tree (Go Implementation)
```
core pipeline
├── audit <repo>                # Stage 1: audit issues → implementation issues
├── epic
│   ├── create <repo>           # Stage 2: group issues into epics
│   ├── run <epic-number>       # Stage 3: dispatch + monitor an epic
│   ├── status [epic-number]    # Show epic progress
│   └── sync <epic-number>      # Tick parent checklist from closed children
├── monitor [repo]              # Watch all open PRs, auto-intervene
├── fix
│   ├── reviews <pr-number>     # "Can you fix the code reviews?"
│   ├── conflicts <pr-number>   # "Can you fix the merge conflict?"
│   ├── format <pr-number>      # gofmt, commit, push (no AI)
│   └── threads <pr-number>     # Resolve all threads after fix
├── onboard <repo>              # Full: audit → epic → dispatch
├── budget                      # Daily usage vs pool
│   ├── plan                    # Optimal dispatch for today
│   └── log                     # Append dispatch event
└── training
    ├── capture <pr-number>     # Journal entry for merged PR
    ├── stats                   # Summary across journals
    └── export                  # Clean export for LEM training
```
## MetaReader — Structural Signals Only
The core abstraction. Every pipeline decision comes through this interface. **NEVER reads comment bodies, commit messages, PR descriptions, or review content.**
```go
type MetaReader interface {
	GetPRMeta(repo string, pr int) (*PRMeta, error)
	GetEpicMeta(repo string, issue int) (*EpicMeta, error)
	GetIssueState(repo string, issue int) (string, error)
	GetCommentReactions(repo string, commentID int64) ([]ReactionMeta, error)
}
```
### PRMeta
```go
type PRMeta struct {
	Number          int
	State           string // OPEN, MERGED, CLOSED
	Mergeable       string // MERGEABLE, CONFLICTING, UNKNOWN
	HeadSHA         string
	HeadDate        time.Time
	AutoMerge       bool
	BaseBranch      string
	HeadBranch      string
	Checks          []CheckMeta
	ThreadsTotal    int
	ThreadsResolved int
	HasEyesReaction bool // 👀 = agent acknowledged
}

type CheckMeta struct {
	Name       string // "qa", "build", "org-gate"
	Conclusion string // "SUCCESS", "FAILURE", ""
	Status     string // "COMPLETED", "QUEUED", "IN_PROGRESS"
}
```
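A structural merge gate over these structs might look like the following sketch (trimmed copies of the types above; the exact gates the orchestrator applies are illustrative):

```go
package main

import "fmt"

// Trimmed copies of the structs above — just the fields the gate reads.
type CheckMeta struct{ Name, Conclusion, Status string }

type PRMeta struct {
	State, Mergeable              string
	ThreadsTotal, ThreadsResolved int
	Checks                        []CheckMeta
}

// readyToMerge is the structural gate: open, no conflicts, every thread
// resolved, every check green. It never reads comment or review content.
func readyToMerge(pr PRMeta) bool {
	if pr.State != "OPEN" || pr.Mergeable != "MERGEABLE" {
		return false
	}
	if pr.ThreadsResolved < pr.ThreadsTotal {
		return false
	}
	for _, c := range pr.Checks {
		if c.Status != "COMPLETED" || c.Conclusion != "SUCCESS" {
			return false
		}
	}
	return true
}

func main() {
	pr := PRMeta{
		State: "OPEN", Mergeable: "MERGEABLE",
		ThreadsTotal: 2, ThreadsResolved: 2,
		Checks: []CheckMeta{{"qa", "SUCCESS", "COMPLETED"}},
	}
	fmt.Println(readyToMerge(pr)) // true
}
```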
### EpicMeta
```go
type EpicMeta struct {
	Number   int
	State    string
	Children []ChildMeta
}

type ChildMeta struct {
	Number  int
	Checked bool   // [x] vs [ ]
	State   string // OPEN, CLOSED
	PRs     []int
}
```
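Syncing the parent checklist reduces to a projection over `Children`: closed-but-unchecked children are the ones to tick. A sketch over trimmed copies of the types above:

```go
package main

import "fmt"

// Trimmed copies of the structs above.
type ChildMeta struct {
	Number  int
	Checked bool
	State   string
}

type EpicMeta struct{ Children []ChildMeta }

// needsTick lists children that are CLOSED but still unchecked in the
// parent checklist — the set `core pipeline epic sync` would tick.
func needsTick(e EpicMeta) []int {
	var out []int
	for _, c := range e.Children {
		if c.State == "CLOSED" && !c.Checked {
			out = append(out, c.Number)
		}
	}
	return out
}

func main() {
	e := EpicMeta{Children: []ChildMeta{
		{Number: 101, Checked: false, State: "CLOSED"},
		{Number: 102, Checked: true, State: "CLOSED"},
		{Number: 103, Checked: false, State: "OPEN"},
	}}
	fmt.Println(needsTick(e)) // [101]
}
```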
### Security: What's Explicitly Excluded
The MetaReader has NO methods for:
- `GetCommentBodies` — injection vector
- `GetCommitMessages` — can contain crafted instructions
- `GetPRDescription` — attacker-controlled in fork PRs
- `GetReviewThreadContent` — untrusted input
Implementation uses `gh api` with `--jq` filters that strip content at the query level. Content never enters the Go process.
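For example, a PR-meta fetch might be assembled like this — the endpoint and the `--jq` projection are a sketch, not the production filter:

```go
package main

import "fmt"

// prMetaArgs assembles a `gh api` invocation whose --jq filter projects
// only structural fields, so bodies and descriptions are dropped before
// the response reaches this process. Endpoint and filter are illustrative.
func prMetaArgs(repo string, pr int) []string {
	return []string{
		"api", fmt.Sprintf("repos/%s/pulls/%d", repo, pr),
		"--jq", `{number: .number, state: .state, head: .head.sha}`,
	}
}

func main() {
	fmt.Println(prMetaArgs("lthn/core-agent", 264))
}
```

The slice feeds `exec.Command("gh", args...)`; nothing in the filter selects `.body`, so content is stripped at the query level.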
## Three-Stage Pipeline
```
STAGE 1: AUDIT (flow: audit-issues)
  Input:  Repo with [Audit] issues
  Output: Implementation issues (1 per finding)
  → Classify findings (severity, type, scope, complexity)
  → Detect patterns (3+ similar → framework issue)
  → Close audit issues, link to children

STAGE 2: ORGANISE (flow: create-epic)
  Input:  Implementation issues
  Output: Epic parent with children, branch, phase ordering
  → Group by theme (security, quality, testing)
  → Order into phases (blockers → parallel → cleanup)
  → Create epic branch off dev

STAGE 3: EXECUTE (flow: issue-epic)
  Input:  Epic with children, branch
  Output: Merged PRs, closed issues, training data
  → Dispatch Phase 1 to agents
  → Monitor: CI, reviews, conflicts, merges
  → Intervene: fix reviews / fix conflicts
  → Phase complete → dispatch next phase
  → Epic complete → merge epic branch to dev
```
## Gotchas (Battle-Tested)
| Gotcha | Fix |
|--------|-----|
| Jules creates PRs as user, not bot | Match by branch/issue linkage, not author |
| `git push origin dev` ambiguous (tag+branch) | Use `HEAD:refs/heads/dev` |
| Base branch gofmt breaks ALL PRs | Fix base first, not the PRs |
| Auto-merge needs explicit permissions in caller | Add `permissions: contents: write, pull-requests: write` |
| `--squash` conflicts with merge queue | Use `--auto` alone — queue controls strategy |
## Knowledge Accumulation (Discussions Strategy)
Non-actionable findings (nitpicks, patterns, style preferences) get posted to a queryable knowledge base (Forge/OpenBrain). When patterns emerge, humans create issues.
```
Build → Agents review → Actionable → Fix immediately
                      → Non-actionable → Post to knowledge base
                                       → Patterns emerge
                                       → Human creates Issue
                                       → Agent picks up via pipeline
```
### Discussion Categories
| Channel | Category | Purpose |
|---------|----------|---------|
| 🚧 dev | PR build findings | Per-PR QA findings |
| 🛩️ alpha | Canary findings | Early testing |
| 🛸 beta | Integration findings | Integration testing |
| 🚀 stable | Release audit | Production audit |
### Naming: `{tool}:v{VERSION}`
`qa:v0.0.4.pr.264`, `lint:v0.0.4-alpha.42`, `audit:v0.0.4`
Tool prefixes: `qa:`, `lint:`, `static:`, `docker:`, `e2e:`, `perf:`, `security:`, `audit:`
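Parsing these names back apart is a single cut on `:v` — a sketch:

```go
package main

import (
	"fmt"
	"strings"
)

// splitDiscussion splits a `{tool}:v{VERSION}` title into its tool prefix
// and version part, e.g. "qa:v0.0.4.pr.264" → ("qa", "0.0.4.pr.264").
func splitDiscussion(title string) (tool, version string, ok bool) {
	tool, rest, found := strings.Cut(title, ":v")
	if !found || tool == "" || rest == "" {
		return "", "", false
	}
	return tool, rest, true
}

func main() {
	t, v, _ := splitDiscussion("lint:v0.0.4-alpha.42")
	fmt.Println(t, v) // lint 0.0.4-alpha.42
}
```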
### Pattern Detection
Query discussions to surface patterns across builds:
```bash
# 47 aria-label mentions across dev discussions → time for a11y audit issue
gh api graphql ... | grep -c "aria-label"
```
### CLI Integration
```bash
core go qa --post-findings # Post lint findings to discussion
core php qa --post-findings # Same for PHP
core qa # Aggregated summary
```
### Connection to Training
Discussion patterns → Issue → Agent implements → PR merged → findings captured as LEM training data. This is the feedback loop that makes agents better at conventions over time.
---
## Related RFCs
- `code/core/agent/flow/` — Flow YAML system
- `code/core/agent/RFC.md` — Agent dispatch system
- `project/lthn/lem/RFC-TRAINING-PIPELINE.md` — Findings → training data