Commit graph

72 commits

Author SHA1 Message Date
Virgil
daf4611bfc feat(agentic): emit rate limit backoff events
Co-Authored-By: Virgil <virgil@lethean.io>
2026-04-01 20:12:28 +00:00
Virgil
1cc318e2e8 feat(agentic): persist runtime backoff state
Co-Authored-By: Virgil <virgil@lethean.io>
2026-04-01 17:54:26 +00:00
Virgil
130b2c84d1 fix(agentic): align workspace mount and repo sync
Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-31 14:54:32 +00:00
Virgil
7d62899b7e fix(agentic): restore RFC completion pipeline
Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-31 06:44:56 +00:00
Virgil
f0c903d8c3 fix(ax): replace typed service lookups with Core.Service
Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-31 05:43:14 +00:00
Virgil
998b6094ca fix(ax): trim remaining comment narration
Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-31 05:34:44 +00:00
Virgil
e8cb1f2d39 fix(ax): trim remaining implementation comments
Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-31 05:28:26 +00:00
Virgil
06ae13d92b fix(ax): align remaining comments with AX principles
Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-31 04:33:36 +00:00
Virgil
326c046d4e fix(ax): align remaining comments and names
Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-30 23:01:42 +00:00
Virgil
bd12c0a31a fix(ax): align code comments with AX principles
Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-30 22:54:19 +00:00
Virgil
877de43257 fix(ax): refine harvest and dispatch naming
Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-30 22:46:21 +00:00
Virgil
b8e948f2d6 fix(ax): continue AX naming cleanup
Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-30 22:40:28 +00:00
Virgil
da25b6f79f fix(ax): continue AX naming cleanup
Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-30 21:37:15 +00:00
Virgil
e825550a90 fix(ax): continue AX naming cleanup
Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-30 21:30:49 +00:00
Virgil
1cc8fb50e1 fix(ax): continue workspace naming cleanup
Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-30 21:22:54 +00:00
Virgil
f11d1d47a1 fix(ax): continue AX naming cleanup
Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-30 21:11:06 +00:00
Virgil
7d6fd8af5e fix(ax): make agentic status reads Result-native
Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-30 20:40:31 +01:00
Virgil
f3b70fcdfb fix(ax): use Result-native status writes
Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-30 20:33:12 +01:00
Virgil
faf6b8b6fb fix(ax): unify home-path resolution
Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-30 19:54:17 +01:00
Virgil
d4c82ccd56 fix(ax): extract dispatch completion monitor
Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-30 17:02:28 +00:00
Virgil
6d239d5b95 fix(ax): remove pid syscall wrappers
Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-30 16:01:32 +00:00
Virgil
3c2575f45b fix(ax): remove proc.go wrapper layer
Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-30 15:48:21 +00:00
Virgil
a2ff1ce19a fix(agentic): repair AX refactor regressions
Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-30 14:01:43 +00:00
Virgil
3d2fa035a9 refactor(agentic): migrate workflow helpers to core.Result
Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-30 07:30:42 +00:00
Virgil
65970527e8 fix(ax): centralise workspace path helpers
Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-29 21:11:46 +00:00
Virgil
6ac195c2e6 fix(agentic): align workspace flow with AX design
Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-29 20:15:58 +00:00
Snider
6bb4fb8d57 fix(dispatch): concurrency, queue runner, and path improvements
Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-29 20:40:20 +01:00
Snider
53db749738 fix(runner): reserve slot on approval to prevent TOCTOU race
Runner now creates a reservation entry (PID=-1) in the workspace Registry
immediately when approving a dispatch. This prevents parallel requests
from all seeing count < limit before any spawn completes.

Reservations are counted by countRunningByAgent/ByModel (PID < 0 = always
count). Agentic overwrites with real PID via TrackWorkspace after spawn.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-26 11:23:04 +00:00
Snider
0fc6eeb4cc feat(runner): extract dispatch runner into independent Core service
Moves concurrency, queue drain, workspace lifecycle, and frozen state
from agentic/prep into pkg/runner/ — a standalone Core service that
communicates via IPC Actions only.

- runner.Register wires Actions: dispatch, status, start, stop, kill, poke
- runner.HandleIPCEvents catches AgentCompleted → ChannelPush + queue poke
- Agentic dispatch asks runner for permission via c.Action("runner.dispatch")
- Dispatch mutex moved to struct-level sync.Mutex (fixes core.Lock init race)
- Registry-based concurrency counting replaces disk scanning
- TrackWorkspace called on both queued and running status writes
- SpawnQueued message added for runner→agentic spawn requests
- ChannelPush message in core/mcp enables any service to push channel events
- 51 new tests covering runner service, queue, and config parsing

Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-26 11:00:47 +00:00
Snider
ba281a2a2d feat: clone workspace dependencies + Docker cleanup
- cloneWorkspaceDeps: reads go.mod, clones Core ecosystem modules from Forge
  into workspace alongside ./repo, rebuilds go.work with all use directives
- Deduplicates deps (dappco.re + forge.lthn.ai map to same repos)
- Container chmod: workspace files made writable before exit so host can clean up
- GONOSUMCHECK for local workspace modules (bypass checksum for dev branches)
- Removed stale OLLAMA_HOST env from container

Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-26 07:37:43 +00:00
Snider
19d849aa75 fix: use Core PerformAsync for agent monitoring — app stays alive
The monitoring goroutine that waits for agent completion was a raw `go func()`
that Core didn't know about. ServiceShutdown killed it immediately on CLI exit.

Now uses PerformAsync which registers with Core's WaitGroup:
- ServiceShutdown waits for all async tasks to drain
- `core-agent workspace dispatch` blocks until agent completes
- Agent lifecycle properly tracked by the framework

Also whitelist agentic.monitor.* and agentic.complete in entitlement checker.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-26 07:16:48 +00:00
Snider
b57be6eb91 fix: spawnAgent uses ServiceFor instead of global process.Default()
The global process.StartWithOptions() requires process.SetDefault() which
was never called. Use core.ServiceFor[*process.Service] to get the registered
service instance directly — same code path, proper Core wiring.

Fixes: dispatch failing with "failed to spawn codex" on CLI dispatch.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-26 06:47:44 +00:00
Snider
537226bd4d feat: AX v0.8.0 upgrade — Core features + quality gates
AX Quality Gates (RFC-025):
- Eliminate os/exec from all test + production code (12+ files)
- Eliminate encoding/json from all test files (15 files, 66 occurrences)
- Eliminate os from all test files except TestMain (Go runtime contract)
- Eliminate path/filepath, net/url from all files
- String concat: 39 violations replaced with core.Concat()
- Test naming AX-7: 264 test functions renamed across all 6 packages
- Example test 1:1 coverage complete

Core Features Adopted:
- Task Composition: agent.completion pipeline (QA → PR → Verify → Ingest → Poke)
- PerformAsync: completion pipeline runs with WaitGroup + progress tracking
- Config: agents.yaml loaded once, feature flags (auto-qa/pr/merge/ingest)
- Named Locks: c.Lock("drain") for queue serialisation
- Registry: workspace state with cross-package QUERY access
- QUERY: c.QUERY(WorkspaceQuery{Status: "running"}) for cross-service queries
- Action descriptions: 25+ Actions self-documenting
- Data mounts: prompts/tasks/flows/personas/workspaces via c.Data()
- Content Actions: agentic.prompt/task/flow/persona callable via IPC
- Drive endpoints: forge + brain registered with tokens
- Drive REST helpers: DriveGet/DrivePost/DriveDo for Drive-aware HTTP
- HandleIPCEvents: auto-discovered by WithService (no manual wiring)
- Entitlement: frozen-queue gate on write Actions
- CLI dispatch: workspace dispatch wired to real dispatch method
- CLI: --quiet/-q and --debug/-d global flags
- CLI: banner, version, check (with service/action/command counts), env
- main.go: minimal — 5 services + c.Run(), no os import
- cmd tests: 84.2% coverage (was 0%)

Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-26 06:38:02 +00:00
Snider
f83c753277 feat(v0.8.0): full AX migration — ServiceRuntime, Actions, quality gates, transport
go-process:
- Register factory, Result lifecycle, 5 named Action handlers
- Start/Run/StartWithOptions/RunWithOptions all return core.Result
- core.ID() replaces fmt.Sprintf, core.As replaces errors.As

core/agent:
- PrepSubsystem + monitor.Subsystem + setup.Service embed ServiceRuntime[T]
- 22 named Actions + agent.completion Task pipeline in OnStartup
- ChannelNotifier removed — all IPC via c.ACTION(messages.X{})
- proc.go: all methods via s.Core().Process(), returns core.Result
- status.go: WriteAtomic + JSONMarshalString
- paths.go: Fs.NewUnrestricted() replaces unsafe.Pointer
- transport.go: ONE net/http file — HTTPGet/HTTPPost/HTTPDo/MCP transport
- All disallowed imports eliminated from source files (13 quality gates)
- String concat eliminated — core.Concat() throughout
- 1:1 _test.go + _example_test.go for every source file
- Reference docs synced from core/go v0.8.0
- RFC-025 updated with net/http, net/url, io/fs quality gates
- lib.go: io/fs eliminated via Data.ListNames, Array[T].Deduplicate

Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-26 01:27:46 +00:00
Snider
4eb1111faa refactor: clean up proc.go — ensureProcess bridge, processIsRunning/processKill helpers
- proc.go: ensureProcess() as temporary bridge until go-process gets v0.7.0 update
- processIsRunning/processKill: use go-process ProcessID when available, fall back to PID
- WorkspaceStatus: add ProcessID field for go-process managed lookup
- dispatch.go: simplified spawnAgent goroutine — uses proc.Done() instead of syscall poll
- Removed syscall import from dispatch.go

Next: update go-process to v0.7.0 Core contract, then replace
syscall.Kill calls in queue.go, shutdown.go, status.go, dispatch_sync.go
with processIsRunning/processKill.

Coverage: 78.1%, 802 tests

Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-25 10:10:17 +00:00
Snider
8521a55907 refactor: eliminate os/exec from all source files → go-process
Replace ALL exec.Command calls with proc.go helpers backed by go-process:
- runCmd/runCmdEnv/runCmdOK — general command execution
- gitCmd/gitCmdOK/gitOutput — git-specific helpers
- ensureProcess() — lazy default service init

Refactored files (0 source files import os/exec now):
- verify.go: runGoTests, runPHPTests, runNodeTests, rebaseBranch
- dispatch.go: runQA (3 exec.Command chains → runCmdOK)
- prep.go: getGitLog, prepWorkspace clone/branch
- pr.go: createPR branch detection + push
- auto_pr.go: commit counting + push
- mirror.go: all git ops + gh CLI calls
- review_queue.go: pushAndMerge, buildReviewCommand (returns string+args now)
- paths.go: DefaultBranch

Coverage: 77.2% → 78.4% (+1.2pp from testable process paths)
802 tests

Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-25 09:51:57 +00:00
Snider
1e12b062dd refactor: decompose spawnAgent monolith — agentic 67.8%, 546 tests
Extract 7 functions from 150-line spawnAgent goroutine:
- detectFinalStatus: BLOCKED.md + exit code → status/question
- trackFailureRate: fast-failure detection + backoff
- startIssueTracking/stopIssueTracking: Forge stopwatch
- broadcastStart/broadcastComplete: IPC + audit events
- onAgentComplete: orchestrates all post-completion steps
- agentOutputFile: log path helper

spawnAgent is now: build command → start process → broadcast → monitor.
All extracted functions are independently testable.

Coverage: agentic 66.4% → 67.8% (+1.4pp)

Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-25 01:11:04 +00:00
Snider
d9e7fa092b feat: complete DI migration — IPC pipeline + Config + Locks
Phase 4 complete:
- Auto-PR handler emits PRCreated message
- Verify handler listens for PRCreated, emits PRMerged/PRNeedsReview
- findWorkspaceByPR() for workspace lookup from PR events
- Remove legacy inline fallback from dispatch goroutine

Phase 5 complete:
- agents.yaml loaded once at startup into c.Config()
- canDispatchAgent reads from c.Config() (no re-parsing)
- drainQueue uses c.Lock("drain") when Core available

Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-24 16:44:19 +00:00
Snider
4e69daf2da feat: replace initServices() with core.New() service conclave
Services are now registered during Core construction:
  core.New(
      core.WithService(agentic.Register),
      core.WithService(monitor.Register),
      core.WithService(brain.Register),
  )

- Remove initServices() closure — services created once in conclave
- Commands use c.ServiceStartup()/c.ServiceShutdown() for lifecycle
- Service instances retrieved via c.Config() for MCP tool registration
- run/orchestrator reduced to ServiceStartup + block + ServiceShutdown
- run/task uses conclave's agentic instance

Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-24 16:33:04 +00:00
Snider
cdb29a2f75 feat(ipc): replace CompletionNotifier callbacks with Core IPC messages
Phase 3 of Core DI migration:
- Remove CompletionNotifier interface from pkg/agentic
- dispatch.go emits messages.AgentStarted/AgentCompleted via c.ACTION()
- monitor registers IPC handlers in SetCore() — handleAgentStarted/handleAgentCompleted
- Remove circular callback wiring (SetCompletionNotifier) from main.go
- Export ReadStatus for cross-package use
- Update run/orchestrator to use SetCore() instead of SetCompletionNotifier()

Services now communicate through typed messages, not direct references.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-24 14:46:59 +00:00
Snider
04e3d492e9 fix(monitor): emit agent.completed per task, verify PIDs for queue.drained
- Export ReadStatus (was readStatus) for cross-package use
- AgentCompleted now emits agent.completed with repo/agent/workspace/status
  for every finished task, not just failures
- queue.drained only fires when genuinely empty — verified by checking
  PIDs are alive via kill(0), not just trusting stale status files
- Fix Docker mount paths: /root/ → /home/dev/ for non-root container
- Update all callers and tests

Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-24 13:02:41 +00:00
Snider
9bdd47d9d5 feat(agent): v0.3.0 — dispatch control, run task CLI, quiet notifications, spark pool
- Add agentic_dispatch_start / shutdown / shutdown_now MCP tools
- Queue frozen by default, CORE_AGENT_DISPATCH=1 to auto-start
- Add run task CLI command — single task e2e (prep → spawn → wait)
- Add DispatchSync for blocking dispatch without MCP
- Quiet notifications — only agent.failed and queue.drained events
- Remove duplicate notification paths (direct callback + polling loop)
- codex-spark gets separate concurrency pool (baseAgent routing)
- Rate-limit backoff detection (3 fast failures → 30min pause)
- Review agent uses exec with sandbox bypass (not codex review)
- Bump: core-agent 0.3.0, core plugin 0.15.0, devops plugin 0.2.0

Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-23 16:08:08 +00:00
Snider
6d4b92737e feat(agent): background runner, slim status, Docker dispatch, stopwatch, CLI fixes
- Add background queue runner (runner.go) — 30s tick + poke on completion
- drainQueue now loops to fill all slots per tick
- Add run orchestrator command — standalone queue runner without MCP
- Slim agentic_status — stats only, blocked workspaces listed
- Docker containerised dispatch — all agents run in core-dev container
- Forge stopwatch start/stop on issue when agent starts/completes
- issue create supports --milestone, --assignee, --ref
- Auto-PR targets dev branch (not main)
- PR body includes Closes #N for issue-linked work
- CLI usage strings use spaces not slashes
- Review agent uses exec with sandbox bypass (not codex review subcommand)
- Local model support via codex --oss with socat Ollama proxy

Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-23 12:53:33 +00:00
Snider
9b225af2f9 refactor(monitor): direct event push, no filesystem polling
CompletionNotifier interface now has AgentStarted() and
AgentCompleted() instead of Poke(). Dispatch pushes notifications
directly to monitor with agent/repo/status data. Monitor pushes
MCP channel events immediately — no scanning, no dedup maps,
no filesystem polling latency.

Events.jsonl kept as audit log only, not notification mechanism.
Timer-based scan kept for startup seeding and stale detection.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-22 16:19:13 +00:00
Snider
4bcc04d890 feat(monitor): agent.started + agent.complete channel notifications
- emitStartEvent fires when agent spawns (dispatch.go)
- Monitor detects new "running" workspaces and pushes agent.started
  channel notification with repo and agent info
- agent.complete already included blocked/failed status — no change
- Both old and new workspace layouts supported

Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-22 15:45:32 +00:00
Snider
69f0acce0d fix(dispatch): model flag splice broke -o argument ordering
When model variant was specified, the splice inserted --model
between -o and its value, making Codex see -o with no file path.
Fixed by appending --model and prompt sequentially instead of
splicing into the middle of the args slice.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-22 15:45:16 +00:00
Snider
c73ea6ad7a fix(dispatch): strip model variant from log filename
Agent name like "codex:gpt-5.3-codex-spark" contains a colon which
breaks file paths. Use base name (before ":") for the log file.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-22 15:45:16 +00:00
Snider
d6864eb37c feat(dispatch): QA gate before auto-PR
After agent completes, run build + vet + test before creating PR.
If QA fails, mark workspace as failed with "QA check failed" —
bad code never gets PR'd.

Supports Go (build/vet/test), PHP (composer install/test),
and Node (npm install/test). Unknown languages pass through.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-22 15:45:16 +00:00
Snider
6e03287178 refactor(agentic): workspace = clone, prompt replaces files
Major simplification of the dispatch model:
- Workspace dir: .core/workspace/{org}/{repo}/{pr|task|branch|tag}/
- Clone into repo/ (not src/), metadata in .meta/
- One of issue, pr, branch, or tag required for dispatch
- All context (brain, consumers, git log, wiki, plan) assembled
  into prompt string — no TODO.md, PROMPT.md, CONTEXT.md files
- Resume detection: skip clone if repo/.git exists
- Default agent changed to codex
- spawnAgent drops srcDir param, runs from repo/
- No --skip-git-repo-check (repo/ IS a git repo)
- All downstream files: srcDir → repoDir

Track PRs, not workspace iterations.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-22 15:45:16 +00:00
Snider
6e37bd22f0 feat: devops plugin, CLI commands, Codex dispatch fixes, AX sweep
DevOps plugin (5 skills):
- install-core-agent, repair-core-agent, merge-workspace,
  update-deps, clean-workspaces

CLI commands: version, check, extract for diagnostics.

Codex dispatch: --skip-git-repo-check, removed broken
--model-reasoning-effort, --sandbox workspace-write via
--full-auto. Workspace template extracts to wsDir not srcDir.

AX sweep (Codex-generated): sanitise.go extracted from prep/plan,
mirror.go JSON parsing via encoding/json, setup/config.go URL
parsing via net/url, strings/fmt imports eliminated from setup.

CODEX.md template updated with Env/Path patterns.
Review workspace template with audit-only PROMPT.md.
Marketplace updated with devops plugin.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-22 15:45:16 +00:00