Replace internal task tracking (TODO.md, FINDINGS.md) with structured documentation in docs/. Trim CLAUDE.md to agent instructions only. Co-Authored-By: Virgil <virgil@lethean.io>
16 KiB
go-agentic Architecture
Module: forge.lthn.ai/core/go-agentic
This document describes the internal structure of go-agentic: how components relate, what interfaces they implement, and how data flows through the system.
Overview
go-agentic is an AI service lifecycle and task management library. It provides:
- A Core framework service that spawns
claudesubprocess invocations - Quota enforcement (allowance management) with three storage backends
- An agent registry with two persistence backends
- A task router and dispatcher for multi-agent coordination
- Event hooks for lifecycle notification
- CLI-oriented helper functions for status display, task submission, and log streaming
The package has no dependency on go-ai, go-ml, go-rag, or go-mcp. Its only non-standard imports are forge.lthn.ai/core/go (the Core DI framework), forge.lthn.ai/core/go-store (SQLite KV), and github.com/redis/go-redis/v9.
Service Lifecycle
Service wraps framework.ServiceRuntime[ServiceOptions] and integrates with the Core dependency injection container.
type Service struct {
*framework.ServiceRuntime[ServiceOptions]
}
On startup, OnStartup registers a task handler with the Core runtime:
func (s *Service) OnStartup(ctx context.Context) error {
s.Core().RegisterTask(s.handleTask)
return nil
}
The handler dispatches on task type:
TaskCommit— runs the embeddedprompts/commit.mdprompt viaclaude -p <prompt> --allowedTools Bash,Read,Glob,Grepin the specified directory. IfCanEditis true,WriteandEditare added to the tool list.TaskPrompt— runs an arbitrary prompt. Allowed tools default toServiceOptions.DefaultTools(Bash, Read, Glob, Grep) but can be overridden per task. Reports progress viaCore().Progress()using the task ID.
Both task types spawn claude as a subprocess with os.Stdout/os.Stderr/os.Stdin connected directly, so output streams to the terminal.
ServiceOptions
type ServiceOptions struct {
DefaultTools []string // default tool list for TaskPrompt
AllowEdit bool // global permission for Write/Edit tools
}
Default tools are ["Bash", "Read", "Glob", "Grep"]. Write/Edit require explicit opt-in either globally or per task.
Allowance Management
Allowance management enforces per-agent and per-model token quotas. It is composed of three layers:
AllowanceStore— persistence interfaceAllowanceService— enforcement logic- Three
AllowanceStoreimplementations — Memory, SQLite, Redis
AllowanceStore Interface
type AllowanceStore interface {
GetAllowance(agentID string) (*AgentAllowance, error)
SetAllowance(a *AgentAllowance) error
GetUsage(agentID string) (*UsageRecord, error)
IncrementUsage(agentID string, tokens int64, jobs int) error
DecrementActiveJobs(agentID string) error
ReturnTokens(agentID string, tokens int64) error
ResetUsage(agentID string) error
GetModelQuota(model string) (*ModelQuota, error)
GetModelUsage(model string) (int64, error)
IncrementModelUsage(model string, tokens int64) error
}
All three implementations satisfy this interface with identical semantics. The AgentAllowance struct holds per-agent limits; UsageRecord holds current period usage.
MemoryStore
In-memory implementation using sync.RWMutex. State is lost on process restart. Suitable for single-process use, testing, and development. Uses defensive copy semantics: Get and Set return copies of stored structs to prevent aliasing.
SQLiteStore
Persistent single-node implementation using forge.lthn.ai/core/go-store (a SQLite KV abstraction over modernc.org/sqlite). Data is namespaced into four groups: allowances, usage, model_quotas, model_usage. Read-modify-write operations are serialised with a sync.Mutex. WAL mode is enabled for concurrent reads. time.Duration values are serialised as int64 nanoseconds to avoid locale-dependent string parsing.
Default database path: ~/.config/agentic/allowance.db.
RedisStore
Multi-process persistent implementation using github.com/redis/go-redis/v9. Suitable for deployments where multiple agent processes share quota state. Atomic read-modify-write is implemented via Lua scripts registered on the Redis server (EVAL). Key pattern: {prefix}:{type}:{id}. Default prefix: agentic. Pings the server at construction time.
Backend Selection
NewAllowanceStoreFromConfig(cfg AllowanceConfig) is the factory:
type AllowanceConfig struct {
StoreBackend string // "memory" | "sqlite" | "redis"
StorePath string // SQLite path (optional)
RedisAddr string // host:port (optional)
}
An empty or unrecognised StoreBackend defaults to "memory".
AllowanceService
AllowanceService wraps an AllowanceStore and enforces quota policy. The check order in Check(agentID, model string) is:
- Model allowlist — rejects if agent's
ModelAllowlistis non-empty and the model is not in it - Daily token limit — rejects if
TokensUsed >= DailyTokenLimit; emitsEventQuotaWarningat 80% usage - Daily job limit — rejects if
JobsStarted >= DailyJobLimit - Concurrent jobs — rejects if
ActiveJobs >= ConcurrentJobs - Global model budget — rejects if model-level
TokensUsed >= DailyTokenBudget
When multiple limits are simultaneously exceeded, the first in this order is reported as the reason.
RecordUsage(report UsageReport) handles four quota events:
| Event | Effect |
|---|---|
QuotaEventJobStarted |
Increments JobsStarted and ActiveJobs by 1 |
QuotaEventJobCompleted |
Adds TokensIn + TokensOut to TokensUsed; decrements ActiveJobs; records model usage |
QuotaEventJobFailed |
Records full token usage; returns 50% of tokens; decrements ActiveJobs |
QuotaEventJobCancelled |
Returns 100% of tokens; decrements ActiveJobs; no model usage recorded |
Two fields on ModelQuota — HourlyRateLimit and CostCeiling — are stored and round-tripped by all backends but are not currently enforced by AllowanceService.Check. Enforcement requires a sliding-window GetHourlyUsage method on AllowanceStore, which would be a breaking interface change. The fields are annotated as reserved in the source.
Agent Registry
AgentRegistry tracks the set of known agents and their health state.
Interface
type AgentRegistry interface {
Register(agent AgentInfo) error
Deregister(id string) error
Get(id string) (AgentInfo, error)
List() []AgentInfo
Heartbeat(id string) error
Reap(ttl time.Duration) []string
}
Reap transitions agents whose LastHeartbeat is older than ttl to AgentOffline and returns their IDs. Heartbeat transitions an AgentOffline agent back to AgentAvailable.
MemoryRegistry
In-memory implementation with sync.RWMutex. Uses copy-on-read semantics consistent with MemoryStore.
SQLiteRegistry
Persistent implementation backed by a database/sql SQLite connection (using modernc.org/sqlite directly, not via go-store). Schema:
CREATE TABLE agents (
id TEXT PRIMARY KEY,
name TEXT NOT NULL DEFAULT '',
capabilities TEXT NOT NULL DEFAULT '[]',
status TEXT NOT NULL DEFAULT 'available',
last_heartbeat DATETIME NOT NULL DEFAULT (datetime('now')),
current_load INTEGER NOT NULL DEFAULT 0,
max_load INTEGER NOT NULL DEFAULT 0,
registered_at DATETIME NOT NULL DEFAULT (datetime('now'))
)
Register uses INSERT ... ON CONFLICT(id) DO UPDATE (UPSERT). WAL mode and busy_timeout=5000ms are set at open time. Capabilities are stored as a JSON array string.
RedisRegistry
TTL-based implementation where each agent is stored as a JSON value at key {prefix}:agent:{id}. Natural key expiry serves as the reap mechanism. Heartbeat refreshes the key's TTL. Reap performs an explicit SCAN for expired keys as a backup to natural expiry. Uses WithRedisPassword, WithRedisDB, and WithRedisPrefix functional options.
Backend Selection
NewAgentRegistryFromConfig(cfg RegistryConfig) mirrors the allowance factory:
type RegistryConfig struct {
RegistryBackend string // "memory" | "sqlite" | "redis"
RegistryPath string // SQLite path (optional)
RegistryRedisAddr string // host:port (optional)
}
Default registry path: ~/.config/agentic/registry.db.
Task Queue and Dispatcher
Task Model
Task is the core data type:
type Task struct {
ID string
Title string
Description string
Priority TaskPriority // critical | high | medium | low
Status TaskStatus // pending | in_progress | completed | blocked | failed
Labels []string // capability requirements for routing
Files []string // relevant files for context building
Dependencies []string // task IDs that must complete first
MaxRetries int // 0 uses DefaultMaxRetries (3)
RetryCount int // failed dispatch attempts so far
LastAttempt *time.Time // when the last dispatch attempt occurred
FailReason string // set when StatusFailed
// ... timestamps, ClaimedBy, Project, etc.
}
TaskRouter
DefaultRouter selects an agent using two strategies:
- For
PriorityCriticaltasks: picks the agent with the lowestCurrentLoad, tie-broken alphabetically by agent ID. - For all other tasks: scores agents as
1.0 - (CurrentLoad / MaxLoad)and picks the highest scorer. Agents withMaxLoad == 0score 1.0 (unlimited capacity).
Eligibility filtering runs first: only agents with status AgentAvailable or AgentBusy with remaining capacity are considered. The task's Labels field maps to capability requirements — all labels must be present in the agent's Capabilities slice.
Returns ErrNoEligibleAgent when no agent qualifies.
Dispatcher
Dispatcher combines AgentRegistry, TaskRouter, AllowanceService, and Client:
Dispatch(ctx, task):
1. registry.List() — get candidate agents
2. router.Route(task, agents) — select best agent
3. allowance.Check(agentID) — verify quota
4. client.ClaimTask(task.ID) — mark in-progress via API
5. allowance.RecordUsage(JobStarted) — record job start
→ emit EventTaskDispatched
Failure paths emit EventDispatchFailedNoAgent or EventDispatchFailedQuota.
DispatchLoop
DispatchLoop(ctx, interval) polls client.ListTasks(pending) at the given interval and dispatches each task. Key behaviours:
- Tasks are sorted by priority (Critical first) then by
CreatedAt(oldest first) before dispatch. - Tasks with
RetryCount > 0are skipped ifLastAttempt + backoff(RetryCount) > now. Backoff is exponential starting at 5 seconds (5s, 10s, 20s, ...). - Tasks that fail dispatch have
RetryCountincremented. Tasks reachingMaxRetries(default 3) are updated toStatusFailedvia the API (dead-lettered) and emitEventTaskDeadLettered. - Transient API errors listing tasks are logged and the loop continues.
- The loop exits when the context is cancelled and returns
ctx.Err().
Context Builder
BuildTaskContext(task, dir) gathers context for AI collaboration:
- Reads files explicitly listed in
task.Filesfrom the working directory. - Runs
git status --porcelainandgit log --oneline -10. - Extracts keywords from the task title and description (stop-word filtered, minimum 3 chars, up to 5 keywords) and runs
git grep -l -i <keyword> -- *.go *.ts *.js *.pyfor each. - Reads up to 10 matching source files, truncating files over 5,000 bytes.
TaskContext.FormatContext() renders the collected data as a Markdown string with sections for task metadata, task files, git status, recent commits, and related code — suitable for inclusion in a prompt sent to an AI model.
GatherRelatedFiles(task, dir) is the public sub-function that reads only the explicitly listed files.
CLI Client
Client is an HTTP client for the core-agentic REST API. It sends Bearer token authentication on every request.
Endpoints used:
| Method | Path | Client function |
|---|---|---|
| GET | /api/tasks | ListTasks |
| GET | /api/tasks/{id} | GetTask |
| POST | /api/tasks | CreateTask |
| POST | /api/tasks/{id}/claim | ClaimTask |
| PATCH | /api/tasks/{id} | UpdateTask |
| POST | /api/tasks/{id}/complete | CompleteTask |
| GET | /api/health | Ping |
ClaimTask accepts both ClaimResponse{"task": ...} and bare Task JSON to handle varying API implementations. checkResponse parses error bodies as APIError{code, message, details} or falls back to a generic HTTP status message.
Default timeout: 30 seconds. Default base URL: http://localhost:8080 (override with AGENTIC_BASE_URL).
Configuration
LoadConfig(dir) resolves configuration in this order:
.envfile in the specified directory (or current directory ifdiris empty), readingAGENTIC_BASE_URL,AGENTIC_TOKEN,AGENTIC_PROJECT,AGENTIC_AGENT_ID.~/.core/agentic.yamlwith the same fields in YAML form.- Environment variable overrides applied on top of whichever file was loaded.
A missing token always returns an error. SaveConfig writes to ~/.core/agentic.yaml, creating the directory if needed.
Event Hooks
All lifecycle events are published through the EventEmitter interface:
type EventEmitter interface {
Emit(ctx context.Context, event Event) error
}
Event carries a typed string EventType, optional TaskID and AgentID, a Timestamp, and an untyped Payload.
Event Types
| Type | Emitter | Trigger |
|---|---|---|
task_dispatched |
Dispatcher | Successful routing, quota check, and claim |
task_claimed |
Dispatcher | API claim call succeeded |
dispatch_failed_no_agent |
Dispatcher | ErrNoEligibleAgent from router |
dispatch_failed_quota |
Dispatcher | Allowance check denied |
task_dead_lettered |
Dispatcher | Task exceeded MaxRetries |
quota_warning |
AllowanceService | Agent at 80%+ of daily token limit |
quota_exceeded |
AllowanceService | Any limit check fails |
usage_recorded |
AllowanceService | JobStarted or JobCompleted events |
Implementations
ChannelEmitter — buffered chan Event. Drops events silently if the buffer is full to avoid blocking the dispatch path. Default buffer size: 64. Use Events() to receive.
MultiEmitter — fans out to a slice of emitters. Failures in one emitter do not stop the others. Thread-safe: emitters can be added with Add() at any time.
Both Dispatcher and AllowanceService accept an emitter via SetEventEmitter(). A nil emitter is safe — all emit calls are no-ops.
CLI Backing Functions
These functions are intended to be called by core agent CLI commands implemented in the core/cli repository.
GetStatus(ctx, registry, client, allowanceSvc) — aggregates registry.List(), counts of pending and in-progress tasks from the client, and remaining tokens per agent from the allowance service. Any argument can be nil; those sections are skipped. Returns *StatusSummary.
FormatStatus(summary) — renders a StatusSummary as a human-readable table with agent rows sorted alphabetically by ID.
SubmitTask(ctx, client, title, description, labels, priority) — validates that title is non-empty, constructs a Task with StatusPending and current timestamp, and calls client.CreateTask.
StreamLogs(ctx, client, taskID, interval, writer) — polls client.GetTask at the given interval, writing timestamped status lines to writer. Stops on StatusCompleted or StatusBlocked. Transient errors are written to the stream but do not stop polling. Returns ctx.Err() on cancellation.
Prompt Templates
embed.go embeds all files under prompts/*.md at compile time. Prompt(name) reads a template by name (without .md), trims whitespace, and returns an empty string if the file does not exist. Currently one template is provided: prompts/commit.md, used by TaskCommit to guide Claude through generating a conventional commit.
Dependency Graph
go-agentic
forge.lthn.ai/core/go (ServiceRuntime, framework, io, log)
forge.lthn.ai/core/go-store (SQLite KV — SQLiteStore)
github.com/redis/go-redis/v9 (RedisStore, RedisRegistry)
gopkg.in/yaml.v3 (config file parsing)
modernc.org/sqlite (SQLiteRegistry — direct database/sql)
github.com/stretchr/testify (tests only)
The module uses Go workspace replace directives to reference ../go and ../go-store during development.