# Agentic Pipeline v2: Autonomous Dispatch→Verify→Merge

> The full autonomous pipeline: issue → dispatch → implement → verify → PR → merge.
> CodeRabbit findings = 0 is the KPI.

---

## Pipeline Flow

```
Issue created (Forge/GitHub)
  → core-agent picks up event
  → Selects flow YAML based on event type + repo
  → Prepares sandboxed workspace (CODEX.md, .core/reference/)
  → Dispatches agent (codex/gemini/claude)
  → Agent implements in workspace
  → QA flow runs (build, test, vet, lint)
  → If QA passes → create PR to dev
  → CodeRabbit reviews PR
  → If findings = 0 → auto-merge
  → If findings > 0 → dispatch fix agent → repeat
  → PR merged → training data captured
  → Issue closed
```
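
The review → fix → repeat tail of this flow can be sketched as a small loop. This is a minimal illustration, not the core-agent implementation: `reviewLoop`, its callback parameters, and the `maxRounds` cap are all hypothetical names and assumptions (the flow above does not state a retry limit).

```go
package main

import "fmt"

// reviewLoop repeats review → fix until a review reports zero findings or
// maxRounds reviews have run. It returns whether the PR may auto-merge and
// how many reviews were performed. The round cap is an assumption; the flow
// itself only says "repeat".
func reviewLoop(review func() int, fix func(findings int), maxRounds int) (bool, int) {
	rounds := 0
	for rounds < maxRounds {
		rounds++
		findings := review()
		if findings == 0 {
			return true, rounds // findings = 0 → auto-merge
		}
		fix(findings) // findings > 0 → dispatch fix agent, then review again
	}
	return false, rounds
}

func main() {
	// Simulated finding counts for three successive reviews.
	results := []int{3, 1, 0}
	i := 0
	review := func() int { n := results[i]; i++; return n }
	fix := func(n int) { fmt.Printf("dispatching fix agent for %d findings\n", n) }

	merge, rounds := reviewLoop(review, fix, 5)
	fmt.Println("auto-merge:", merge, "after reviews:", rounds)
}
```

Capping the rounds keeps a PR that never converges from consuming agent budget indefinitely; what happens after the cap (for example, escalating to a human) is left open here.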

## Key Design Decisions

### Sandboxing

Agents MUST be sandboxed to their assigned repo. Unsandboxed writes caused the CLI mess, in which an agent wrote files to the wrong repo. Workspace isolation is non-negotiable.
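
One way to enforce this at the file-write boundary is a path containment check. The sketch below is an assumption about how such a guard could look, not the actual sandbox: `insideWorkspace` and the example paths are hypothetical, and the check is purely lexical (it does not resolve symlinks, which a real sandbox would also need to handle).

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// insideWorkspace reports whether target stays within workspace.
// workspace must be an absolute path; a relative target is resolved
// against it. The check is lexical only: symlinks are not followed.
func insideWorkspace(workspace, target string) bool {
	if !filepath.IsAbs(target) {
		target = filepath.Join(workspace, target)
	}
	rel, err := filepath.Rel(workspace, target)
	if err != nil {
		return false
	}
	// Any path that has to climb out of the workspace starts with "..".
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

func main() {
	ws := "/work/repo-a" // hypothetical workspace root
	for _, p := range []string{"cmd/main.go", "../repo-b/main.go", "/work/repo-b/x.go"} {
		fmt.Printf("%-22s allowed=%v\n", p, insideWorkspace(ws, p))
	}
}
```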

### CodeRabbit KPI

CodeRabbit findings = 0 is the target. Every finding means:

- Template didn't prevent it → fix the template
- Model didn't catch it → add to training data
- Convention wasn't documented → add to RFC

Zero findings = complete convention coverage.

### Checkin API

Agents check in with status via api.lthn.sh. Current blocker: Forge webhooks need to fire to lthn.sh so the orchestrator knows when to start the pipeline.

### Security Model (from Charon flows)

The orchestrator uses STRUCTURAL signals only (labels, PR state, review counts). It never parses comment CONTENT, which makes it immune to prompt injection via issue comments.

## Agent Pool Configuration

See `code/core/go/agent/RFC.md` §Dispatch & Pool Routing for the full `agent.yaml` schema (concurrency, rates, model variants, agent identities).

Concurrency is enforced by the runner service (core/agent). Slot reservation prevents a TOCTOU race between parallel dispatches.
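
To illustrate why reservation avoids the race: if "is there a free slot?" and "take the slot" are separate steps, two parallel dispatches can both pass the check. A minimal sketch, assuming a simple in-process pool (the `Pool` type and its methods are hypothetical, not the runner service's API), performs the check and the claim under one lock:

```go
package main

import (
	"fmt"
	"sync"
)

// Pool tracks how many agent slots are in use.
type Pool struct {
	mu    sync.Mutex
	limit int
	inUse int
}

func NewPool(limit int) *Pool { return &Pool{limit: limit} }

// TryReserve checks capacity and claims a slot in one critical section,
// so no other dispatch can slip in between the check and the claim.
func (p *Pool) TryReserve() bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.inUse >= p.limit {
		return false
	}
	p.inUse++
	return true
}

// Release frees a previously reserved slot.
func (p *Pool) Release() {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.inUse > 0 {
		p.inUse--
	}
}

func main() {
	pool := NewPool(2)
	var wg sync.WaitGroup
	var mu sync.Mutex
	granted := 0
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if pool.TryReserve() {
				mu.Lock()
				granted++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	fmt.Println("slots granted:", granted) // never exceeds the limit of 2
}
```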

## go-process Improvements Needed

- `Timeout`: kill after N minutes (currently agents can run forever)
- `GracePeriod`: SIGTERM before SIGKILL
- `KillGroup`: kill process group, not just PID (prevents orphaned subprocesses)

## Metrics

- 25 repos auto-merged in recent sweep
- 74 findings on core/agent alone (70+ fixed)
- Zero-finding rate improving as templates capture conventions

## `core pipeline` Command Tree (Go Implementation)

```
core pipeline
├── audit <repo>              # Stage 1: audit issues → implementation issues
├── epic
│   ├── create <repo>         # Stage 2: group issues into epics
│   ├── run <epic-number>     # Stage 3: dispatch + monitor an epic
│   ├── status [epic-number]  # Show epic progress
│   └── sync <epic-number>    # Tick parent checklist from closed children
├── monitor [repo]            # Watch all open PRs, auto-intervene
├── fix
│   ├── reviews <pr-number>   # "Can you fix the code reviews?"
│   ├── conflicts <pr-number> # "Can you fix the merge conflict?"
│   ├── format <pr-number>    # gofmt, commit, push (no AI)
│   └── threads <pr-number>   # Resolve all threads after fix
├── onboard <repo>            # Full: audit → epic → dispatch
├── budget                    # Daily usage vs pool
│   ├── plan                  # Optimal dispatch for today
│   └── log                   # Append dispatch event
└── training
    ├── capture <pr-number>   # Journal entry for merged PR
    ├── stats                 # Summary across journals
    └── export                # Clean export for LEM training
```

## MetaReader: Structural Signals Only

`MetaReader` is the core abstraction: every pipeline decision comes through this interface. It **NEVER reads comment bodies, commit messages, PR descriptions, or review content.**

```go
type MetaReader interface {
    GetPRMeta(repo string, pr int) (*PRMeta, error)
    GetEpicMeta(repo string, issue int) (*EpicMeta, error)
    GetIssueState(repo string, issue int) (string, error)
    GetCommentReactions(repo string, commentID int64) ([]ReactionMeta, error)
}
```

### PRMeta
```go
type PRMeta struct {
    Number          int
    State           string // OPEN, MERGED, CLOSED
    Mergeable       string // MERGEABLE, CONFLICTING, UNKNOWN
    HeadSHA         string
    HeadDate        time.Time
    AutoMerge       bool
    BaseBranch      string
    HeadBranch      string
    Checks          []CheckMeta
    ThreadsTotal    int
    ThreadsResolved int
    HasEyesReaction bool // 👀 = agent acknowledged
}

type CheckMeta struct {
    Name       string // "qa", "build", "org-gate"
    Conclusion string // "SUCCESS", "FAILURE", ""
    Status     string // "COMPLETED", "QUEUED", "IN_PROGRESS"
}
```
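
As an illustration of deciding from structural signals alone, the sketch below maps a `PRMeta` to a next action without reading any text content. The types are trimmed copies of the structs above; `nextAction`, the action labels, and the ordering of the rules are assumptions for illustration, not the pipeline's actual policy.

```go
package main

import "fmt"

// Trimmed copies of the structs above, limited to the fields used here.
type CheckMeta struct {
	Name       string
	Conclusion string
	Status     string
}

type PRMeta struct {
	State           string
	Mergeable       string
	Checks          []CheckMeta
	ThreadsTotal    int
	ThreadsResolved int
}

// nextAction picks an intervention from structural fields only.
// The returned labels are hypothetical.
func nextAction(pr PRMeta) string {
	if pr.State != "OPEN" {
		return "none"
	}
	if pr.Mergeable == "CONFLICTING" {
		return "fix-conflicts"
	}
	for _, c := range pr.Checks {
		if c.Status != "COMPLETED" {
			return "wait" // CI still running
		}
		if c.Conclusion != "SUCCESS" {
			return "investigate-checks"
		}
	}
	if pr.ThreadsResolved < pr.ThreadsTotal {
		return "fix-reviews"
	}
	if pr.Mergeable == "MERGEABLE" {
		return "merge"
	}
	return "wait" // mergeability not yet known
}

func main() {
	pr := PRMeta{
		State:           "OPEN",
		Mergeable:       "MERGEABLE",
		Checks:          []CheckMeta{{Name: "qa", Conclusion: "SUCCESS", Status: "COMPLETED"}},
		ThreadsTotal:    2,
		ThreadsResolved: 1,
	}
	fmt.Println(nextAction(pr)) // one review thread is still open
}
```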

### EpicMeta
```go
type EpicMeta struct {
    Number   int
    State    string
    Children []ChildMeta
}

type ChildMeta struct {
    Number  int
    Checked bool   // [x] vs [ ]
    State   string // OPEN, CLOSED
    PRs     []int
}
```
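
These fields are enough to drive an operation like `epic sync` (tick the parent checklist from closed children). A minimal sketch, assuming that "sync" means ticking every child that is closed but still unchecked; `childrenToTick` and `epicComplete` are hypothetical helper names:

```go
package main

import "fmt"

type ChildMeta struct {
	Number  int
	Checked bool   // [x] vs [ ]
	State   string // OPEN, CLOSED
	PRs     []int
}

type EpicMeta struct {
	Number   int
	State    string
	Children []ChildMeta
}

// childrenToTick returns the issue numbers whose checklist entry should be
// ticked: the child is closed but its box is still unchecked.
func childrenToTick(epic EpicMeta) []int {
	var out []int
	for _, c := range epic.Children {
		if c.State == "CLOSED" && !c.Checked {
			out = append(out, c.Number)
		}
	}
	return out
}

// epicComplete reports whether the epic has children and all are closed.
func epicComplete(epic EpicMeta) bool {
	for _, c := range epic.Children {
		if c.State != "CLOSED" {
			return false
		}
	}
	return len(epic.Children) > 0
}

func main() {
	epic := EpicMeta{Number: 100, State: "OPEN", Children: []ChildMeta{
		{Number: 101, State: "CLOSED", Checked: true},
		{Number: 102, State: "CLOSED"},
		{Number: 103, State: "OPEN"},
	}}
	fmt.Println("tick:", childrenToTick(epic), "complete:", epicComplete(epic))
}
```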

### Security: What's Explicitly Excluded

The MetaReader has NO methods for:

- `GetCommentBodies`: injection vector
- `GetCommitMessages`: can contain crafted instructions
- `GetPRDescription`: attacker-controlled in fork PRs
- `GetReviewThreadContent`: untrusted input

The implementation uses `gh api` with `--jq` filters that strip content at the query level, so content never enters the Go process.

## Three-Stage Pipeline

```
STAGE 1: AUDIT (flow: audit-issues)
  Input:  Repo with [Audit] issues
  Output: Implementation issues (1 per finding)
  → Classify findings (severity, type, scope, complexity)
  → Detect patterns (3+ similar → framework issue)
  → Close audit issues, link to children

STAGE 2: ORGANISE (flow: create-epic)
  Input:  Implementation issues
  Output: Epic parent with children, branch, phase ordering
  → Group by theme (security, quality, testing)
  → Order into phases (blockers → parallel → cleanup)
  → Create epic branch off dev

STAGE 3: EXECUTE (flow: issue-epic)
  Input:  Epic with children, branch
  Output: Merged PRs, closed issues, training data
  → Dispatch Phase 1 to agents
  → Monitor: CI, reviews, conflicts, merges
  → Intervene: fix reviews / fix conflicts
  → Phase complete → dispatch next phase
  → Epic complete → merge epic branch to dev
```
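
The Stage 1 pattern rule (3+ similar findings → framework issue) can be sketched as a simple grouping step. This assumes "similar" means "same finding type", which the flow does not define; `Finding`, `frameworkCandidates`, and the example type names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Finding is a classified audit finding. Only the type is used here.
type Finding struct {
	ID   int
	Type string
}

// frameworkCandidates returns the finding types that occur at least
// threshold times, sorted for stable output. Each one would become a
// single framework issue instead of many per-finding issues.
func frameworkCandidates(findings []Finding, threshold int) []string {
	counts := make(map[string]int)
	for _, f := range findings {
		counts[f.Type]++
	}
	var out []string
	for t, n := range counts {
		if n >= threshold {
			out = append(out, t)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	findings := []Finding{
		{1, "missing-error-wrap"}, {2, "missing-error-wrap"}, {3, "unchecked-close"},
		{4, "missing-error-wrap"}, {5, "unchecked-close"},
	}
	fmt.Println(frameworkCandidates(findings, 3)) // [missing-error-wrap]
}
```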

## Gotchas (Battle-Tested)

| Gotcha | Fix |
|--------|-----|
| Jules creates PRs as user, not bot | Match by branch/issue linkage, not author |
| `git push origin dev` ambiguous (tag+branch) | Use `HEAD:refs/heads/dev` |
| Base branch failing gofmt breaks ALL PRs | Fix base first, not the PRs |
| Auto-merge needs explicit permissions in caller | Add `permissions: contents: write, pull-requests: write` |
| `--squash` conflicts with merge queue | Use `--auto` alone; the queue controls strategy |

## Knowledge Accumulation (Discussions Strategy)

Non-actionable findings (nitpicks, patterns, style preferences) get posted to a queryable knowledge base (Forge/OpenBrain). When patterns emerge, humans create issues.

```
Build → Agents review → Actionable → Fix immediately
                      → Non-actionable → Post to knowledge base
                                         → Patterns emerge
                                         → Human creates Issue
                                         → Agent picks up via pipeline
```

### Discussion Categories

| Channel | Category | Purpose |
|---------|----------|---------|
| 🚧 dev | PR build findings | Per-PR QA findings |
| 🛩️ alpha | Canary findings | Early testing |
| 🛸 beta | Integration findings | Integration testing |
| 🚀 stable | Release audit | Production audit |

### Naming: `{tool}:v{VERSION}`

Examples: `qa:v0.0.4.pr.264`, `lint:v0.0.4-alpha.42`, `audit:v0.0.4`

Tool prefixes: `qa:`, `lint:`, `static:`, `docker:`, `e2e:`, `perf:`, `security:`, `audit:`
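
The naming scheme is simple enough to validate mechanically. A minimal sketch, assuming titles are exactly `{tool}:v{VERSION}` with no surrounding text; `parseTitle` and `knownTools` are hypothetical names, and the version string is not validated further:

```go
package main

import (
	"fmt"
	"strings"
)

// knownTools lists the tool prefixes named above, without the colon.
var knownTools = map[string]bool{
	"qa": true, "lint": true, "static": true, "docker": true,
	"e2e": true, "perf": true, "security": true, "audit": true,
}

// parseTitle splits a discussion title of the form {tool}:v{VERSION}
// into its tool and version parts.
func parseTitle(title string) (string, string, error) {
	tool, rest, found := strings.Cut(title, ":")
	if !found || !strings.HasPrefix(rest, "v") || len(rest) == 1 {
		return "", "", fmt.Errorf("title %q does not match {tool}:v{VERSION}", title)
	}
	if !knownTools[tool] {
		return "", "", fmt.Errorf("unknown tool prefix %q", tool)
	}
	return tool, strings.TrimPrefix(rest, "v"), nil
}

func main() {
	for _, t := range []string{"qa:v0.0.4.pr.264", "lint:v0.0.4-alpha.42", "audit:v0.0.4", "foo:v1"} {
		tool, version, err := parseTitle(t)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Printf("tool=%s version=%s\n", tool, version)
	}
}
```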

### Pattern Detection

Query discussions to surface patterns across builds:
```bash
# 47 aria-label mentions across dev discussions → time for a11y audit issue
gh api graphql ... | grep -c "aria-label"
```

### CLI Integration

```bash
core go qa --post-findings   # Post lint findings to discussion
core php qa --post-findings  # Same for PHP
core qa                      # Aggregated summary
```

### Connection to Training

Discussion patterns → Issue → Agent implements → PR merged → findings captured as LEM training data. This is the feedback loop that makes agents better at conventions over time.

---

## Related RFCs

- `code/core/agent/flow/`: Flow YAML system
- `code/core/agent/RFC.md`: Agent dispatch system
- `project/lthn/lem/RFC-TRAINING-PIPELINE.md`: Findings → training data