Snider 6e8ae2a1ec docs: graduate TODO/FINDINGS into production documentation

Replace internal task tracking (TODO.md, FINDINGS.md) with structured
documentation in docs/. Trim CLAUDE.md to agent instructions only.

Co-Authored-By: Virgil <virgil@lethean.io>

2026-02-20 15:02:20 +00:00

16 KiB

Raw Blame History

go-agentic Architecture

Module: forge.lthn.ai/core/go-agentic

This document describes the internal structure of go-agentic: how components relate, what interfaces they implement, and how data flows through the system.

Overview

go-agentic is an AI service lifecycle and task management library. It provides:

A Core framework service that spawns claude subprocess invocations
Quota enforcement (allowance management) with three storage backends
An agent registry with two persistence backends
A task router and dispatcher for multi-agent coordination
Event hooks for lifecycle notification
CLI-oriented helper functions for status display, task submission, and log streaming

The package has no dependency on go-ai, go-ml, go-rag, or go-mcp. Its only non-standard imports are forge.lthn.ai/core/go (the Core DI framework), forge.lthn.ai/core/go-store (SQLite KV), and github.com/redis/go-redis/v9.

Service Lifecycle

Service wraps framework.ServiceRuntime[ServiceOptions] and integrates with the Core dependency injection container.

type Service struct {
    *framework.ServiceRuntime[ServiceOptions]
}

On startup, OnStartup registers a task handler with the Core runtime:

func (s *Service) OnStartup(ctx context.Context) error {
    s.Core().RegisterTask(s.handleTask)
    return nil
}

The handler dispatches on task type:

TaskCommit — runs the embedded prompts/commit.md prompt via claude -p <prompt> --allowedTools Bash,Read,Glob,Grep in the specified directory. If CanEdit is true, Write and Edit are added to the tool list.
TaskPrompt — runs an arbitrary prompt. Allowed tools default to ServiceOptions.DefaultTools (Bash, Read, Glob, Grep) but can be overridden per task. Reports progress via Core().Progress() using the task ID.

Both task types spawn claude as a subprocess with os.Stdout/os.Stderr/os.Stdin connected directly, so output streams to the terminal.

ServiceOptions

type ServiceOptions struct {
    DefaultTools []string  // default tool list for TaskPrompt
    AllowEdit    bool      // global permission for Write/Edit tools
}

Default tools are ["Bash", "Read", "Glob", "Grep"]. Write/Edit require explicit opt-in either globally or per task.

Allowance Management

Allowance management enforces per-agent and per-model token quotas. It is composed of three layers:

AllowanceStore — persistence interface
AllowanceService — enforcement logic
Three AllowanceStore implementations — Memory, SQLite, Redis

AllowanceStore Interface

type AllowanceStore interface {
    GetAllowance(agentID string) (*AgentAllowance, error)
    SetAllowance(a *AgentAllowance) error
    GetUsage(agentID string) (*UsageRecord, error)
    IncrementUsage(agentID string, tokens int64, jobs int) error
    DecrementActiveJobs(agentID string) error
    ReturnTokens(agentID string, tokens int64) error
    ResetUsage(agentID string) error
    GetModelQuota(model string) (*ModelQuota, error)
    GetModelUsage(model string) (int64, error)
    IncrementModelUsage(model string, tokens int64) error
}

All three implementations satisfy this interface with identical semantics. The AgentAllowance struct holds per-agent limits; UsageRecord holds current period usage.

MemoryStore

In-memory implementation using sync.RWMutex. State is lost on process restart. Suitable for single-process use, testing, and development. Uses defensive copy semantics: Get and Set return copies of stored structs to prevent aliasing.

SQLiteStore

Persistent single-node implementation using forge.lthn.ai/core/go-store (a SQLite KV abstraction over modernc.org/sqlite). Data is namespaced into four groups: allowances, usage, model_quotas, model_usage. Read-modify-write operations are serialised with a sync.Mutex. WAL mode is enabled for concurrent reads. time.Duration values are serialised as int64 nanoseconds to avoid locale-dependent string parsing.

Default database path: ~/.config/agentic/allowance.db.

RedisStore

Multi-process persistent implementation using github.com/redis/go-redis/v9. Suitable for deployments where multiple agent processes share quota state. Atomic read-modify-write is implemented via Lua scripts registered on the Redis server (EVAL). Key pattern: {prefix}:{type}:{id}. Default prefix: agentic. Pings the server at construction time.

Backend Selection

NewAllowanceStoreFromConfig(cfg AllowanceConfig) is the factory:

type AllowanceConfig struct {
    StoreBackend string  // "memory" | "sqlite" | "redis"
    StorePath    string  // SQLite path (optional)
    RedisAddr    string  // host:port (optional)
}

An empty or unrecognised StoreBackend defaults to "memory".

AllowanceService

AllowanceService wraps an AllowanceStore and enforces quota policy. The check order in Check(agentID, model string) is:

Model allowlist — rejects if agent's ModelAllowlist is non-empty and the model is not in it
Daily token limit — rejects if TokensUsed >= DailyTokenLimit; emits EventQuotaWarning at 80% usage
Daily job limit — rejects if JobsStarted >= DailyJobLimit
Concurrent jobs — rejects if ActiveJobs >= ConcurrentJobs
Global model budget — rejects if model-level TokensUsed >= DailyTokenBudget

When multiple limits are simultaneously exceeded, the first in this order is reported as the reason.

RecordUsage(report UsageReport) handles four quota events:

Event	Effect
`QuotaEventJobStarted`	Increments `JobsStarted` and `ActiveJobs` by 1
`QuotaEventJobCompleted`	Adds `TokensIn + TokensOut` to `TokensUsed`; decrements `ActiveJobs`; records model usage
`QuotaEventJobFailed`	Records full token usage; returns 50% of tokens; decrements `ActiveJobs`
`QuotaEventJobCancelled`	Returns 100% of tokens; decrements `ActiveJobs`; no model usage recorded

Two fields on ModelQuota — HourlyRateLimit and CostCeiling — are stored and round-tripped by all backends but are not currently enforced by AllowanceService.Check. Enforcement requires a sliding-window GetHourlyUsage method on AllowanceStore, which would be a breaking interface change. The fields are annotated as reserved in the source.

Agent Registry

AgentRegistry tracks the set of known agents and their health state.

Interface

type AgentRegistry interface {
    Register(agent AgentInfo) error
    Deregister(id string) error
    Get(id string) (AgentInfo, error)
    List() []AgentInfo
    Heartbeat(id string) error
    Reap(ttl time.Duration) []string
}

Reap transitions agents whose LastHeartbeat is older than ttl to AgentOffline and returns their IDs. Heartbeat transitions an AgentOffline agent back to AgentAvailable.

MemoryRegistry

In-memory implementation with sync.RWMutex. Uses copy-on-read semantics consistent with MemoryStore.

SQLiteRegistry

Persistent implementation backed by a database/sql SQLite connection (using modernc.org/sqlite directly, not via go-store). Schema:

CREATE TABLE agents (
    id             TEXT PRIMARY KEY,
    name           TEXT NOT NULL DEFAULT '',
    capabilities   TEXT NOT NULL DEFAULT '[]',
    status         TEXT NOT NULL DEFAULT 'available',
    last_heartbeat DATETIME NOT NULL DEFAULT (datetime('now')),
    current_load   INTEGER NOT NULL DEFAULT 0,
    max_load       INTEGER NOT NULL DEFAULT 0,
    registered_at  DATETIME NOT NULL DEFAULT (datetime('now'))
)

Register uses INSERT ... ON CONFLICT(id) DO UPDATE (UPSERT). WAL mode and busy_timeout=5000ms are set at open time. Capabilities are stored as a JSON array string.

RedisRegistry

TTL-based implementation where each agent is stored as a JSON value at key {prefix}:agent:{id}. Natural key expiry serves as the reap mechanism. Heartbeat refreshes the key's TTL. Reap performs an explicit SCAN for expired keys as a backup to natural expiry. Uses WithRedisPassword, WithRedisDB, and WithRedisPrefix functional options.

Backend Selection

NewAgentRegistryFromConfig(cfg RegistryConfig) mirrors the allowance factory:

type RegistryConfig struct {
    RegistryBackend   string  // "memory" | "sqlite" | "redis"
    RegistryPath      string  // SQLite path (optional)
    RegistryRedisAddr string  // host:port (optional)
}

Default registry path: ~/.config/agentic/registry.db.

Task Queue and Dispatcher

Task Model

Task is the core data type:

type Task struct {
    ID           string
    Title        string
    Description  string
    Priority     TaskPriority   // critical | high | medium | low
    Status       TaskStatus     // pending | in_progress | completed | blocked | failed
    Labels       []string       // capability requirements for routing
    Files        []string       // relevant files for context building
    Dependencies []string       // task IDs that must complete first
    MaxRetries   int            // 0 uses DefaultMaxRetries (3)
    RetryCount   int            // failed dispatch attempts so far
    LastAttempt  *time.Time     // when the last dispatch attempt occurred
    FailReason   string         // set when StatusFailed
    // ... timestamps, ClaimedBy, Project, etc.
}

TaskRouter

DefaultRouter selects an agent using two strategies:

For PriorityCritical tasks: picks the agent with the lowest CurrentLoad, tie-broken alphabetically by agent ID.
For all other tasks: scores agents as 1.0 - (CurrentLoad / MaxLoad) and picks the highest scorer. Agents with MaxLoad == 0 score 1.0 (unlimited capacity).

Eligibility filtering runs first: only agents with status AgentAvailable or AgentBusy with remaining capacity are considered. The task's Labels field maps to capability requirements — all labels must be present in the agent's Capabilities slice.

Returns ErrNoEligibleAgent when no agent qualifies.

Dispatcher

Dispatcher combines AgentRegistry, TaskRouter, AllowanceService, and Client:

Dispatch(ctx, task):
  1. registry.List()           — get candidate agents
  2. router.Route(task, agents) — select best agent
  3. allowance.Check(agentID)  — verify quota
  4. client.ClaimTask(task.ID) — mark in-progress via API
  5. allowance.RecordUsage(JobStarted) — record job start
  → emit EventTaskDispatched

Failure paths emit EventDispatchFailedNoAgent or EventDispatchFailedQuota.

DispatchLoop

DispatchLoop(ctx, interval) polls client.ListTasks(pending) at the given interval and dispatches each task. Key behaviours:

Tasks are sorted by priority (Critical first) then by CreatedAt (oldest first) before dispatch.
Tasks with RetryCount > 0 are skipped if LastAttempt + backoff(RetryCount) > now. Backoff is exponential starting at 5 seconds (5s, 10s, 20s, ...).
Tasks that fail dispatch have RetryCount incremented. Tasks reaching MaxRetries (default 3) are updated to StatusFailed via the API (dead-lettered) and emit EventTaskDeadLettered.
Transient API errors listing tasks are logged and the loop continues.
The loop exits when the context is cancelled and returns ctx.Err().

Context Builder

BuildTaskContext(task, dir) gathers context for AI collaboration:

Reads files explicitly listed in task.Files from the working directory.
Runs git status --porcelain and git log --oneline -10.
Extracts keywords from the task title and description (stop-word filtered, minimum 3 chars, up to 5 keywords) and runs git grep -l -i <keyword> -- *.go *.ts *.js *.py for each.
Reads up to 10 matching source files, truncating files over 5,000 bytes.

TaskContext.FormatContext() renders the collected data as a Markdown string with sections for task metadata, task files, git status, recent commits, and related code — suitable for inclusion in a prompt sent to an AI model.

GatherRelatedFiles(task, dir) is the public sub-function that reads only the explicitly listed files.

CLI Client

Client is an HTTP client for the core-agentic REST API. It sends Bearer token authentication on every request.

Endpoints used:

Method	Path	Client function
GET	/api/tasks	ListTasks
GET	/api/tasks/{id}	GetTask
POST	/api/tasks	CreateTask
POST	/api/tasks/{id}/claim	ClaimTask
PATCH	/api/tasks/{id}	UpdateTask
POST	/api/tasks/{id}/complete	CompleteTask
GET	/api/health	Ping

ClaimTask accepts both ClaimResponse{"task": ...} and bare Task JSON to handle varying API implementations. checkResponse parses error bodies as APIError{code, message, details} or falls back to a generic HTTP status message.

Default timeout: 30 seconds. Default base URL: http://localhost:8080 (override with AGENTIC_BASE_URL).

Configuration

LoadConfig(dir) resolves configuration in this order:

.env file in the specified directory (or current directory if dir is empty), reading AGENTIC_BASE_URL, AGENTIC_TOKEN, AGENTIC_PROJECT, AGENTIC_AGENT_ID.
~/.core/agentic.yaml with the same fields in YAML form.
Environment variable overrides applied on top of whichever file was loaded.

A missing token always returns an error. SaveConfig writes to ~/.core/agentic.yaml, creating the directory if needed.

Event Hooks

All lifecycle events are published through the EventEmitter interface:

type EventEmitter interface {
    Emit(ctx context.Context, event Event) error
}

Event carries a typed string EventType, optional TaskID and AgentID, a Timestamp, and an untyped Payload.

Event Types

Type	Emitter	Trigger
`task_dispatched`	Dispatcher	Successful routing, quota check, and claim
`task_claimed`	Dispatcher	API claim call succeeded
`dispatch_failed_no_agent`	Dispatcher	`ErrNoEligibleAgent` from router
`dispatch_failed_quota`	Dispatcher	Allowance check denied
`task_dead_lettered`	Dispatcher	Task exceeded `MaxRetries`
`quota_warning`	AllowanceService	Agent at 80%+ of daily token limit
`quota_exceeded`	AllowanceService	Any limit check fails
`usage_recorded`	AllowanceService	`JobStarted` or `JobCompleted` events

Implementations

ChannelEmitter — buffered chan Event. Drops events silently if the buffer is full to avoid blocking the dispatch path. Default buffer size: 64. Use Events() to receive.

MultiEmitter — fans out to a slice of emitters. Failures in one emitter do not stop the others. Thread-safe: emitters can be added with Add() at any time.

Both Dispatcher and AllowanceService accept an emitter via SetEventEmitter(). A nil emitter is safe — all emit calls are no-ops.

CLI Backing Functions

These functions are intended to be called by core agent CLI commands implemented in the core/cli repository.

GetStatus(ctx, registry, client, allowanceSvc) — aggregates registry.List(), counts of pending and in-progress tasks from the client, and remaining tokens per agent from the allowance service. Any argument can be nil; those sections are skipped. Returns *StatusSummary.

FormatStatus(summary) — renders a StatusSummary as a human-readable table with agent rows sorted alphabetically by ID.

SubmitTask(ctx, client, title, description, labels, priority) — validates that title is non-empty, constructs a Task with StatusPending and current timestamp, and calls client.CreateTask.

StreamLogs(ctx, client, taskID, interval, writer) — polls client.GetTask at the given interval, writing timestamped status lines to writer. Stops on StatusCompleted or StatusBlocked. Transient errors are written to the stream but do not stop polling. Returns ctx.Err() on cancellation.

Prompt Templates

embed.go embeds all files under prompts/*.md at compile time. Prompt(name) reads a template by name (without .md), trims whitespace, and returns an empty string if the file does not exist. Currently one template is provided: prompts/commit.md, used by TaskCommit to guide Claude through generating a conventional commit.

Dependency Graph

go-agentic
    forge.lthn.ai/core/go         (ServiceRuntime, framework, io, log)
    forge.lthn.ai/core/go-store   (SQLite KV — SQLiteStore)
    github.com/redis/go-redis/v9  (RedisStore, RedisRegistry)
    gopkg.in/yaml.v3              (config file parsing)
    modernc.org/sqlite            (SQLiteRegistry — direct database/sql)
    github.com/stretchr/testify   (tests only)

The module uses Go workspace replace directives to reference ../go and ../go-store during development.

16 KiB Raw Blame History

go-agentic Architecture

Overview

Service Lifecycle

ServiceOptions

Allowance Management

AllowanceStore Interface

MemoryStore

SQLiteStore

RedisStore

Backend Selection

AllowanceService

Agent Registry

Interface

MemoryRegistry

SQLiteRegistry

RedisRegistry

Backend Selection

Task Queue and Dispatcher

Task Model

TaskRouter

Dispatcher

DispatchLoop

Context Builder

CLI Client

Configuration

Event Hooks

Event Types

Implementations

CLI Backing Functions

Prompt Templates

Dependency Graph

16 KiB

Raw Blame History