Agent
Virgil edited this page 2026-02-19 18:36:27 +00:00

Agent Orchestrator

Overview

agent.go (1,070 LOC) is the largest file in the package. It orchestrates multi-model scoring runs across the evaluation infrastructure.

Capabilities

  • Multi-model runs: Score responses from multiple models in parallel
  • Remote infrastructure: Schedule scoring on M3 Ultra machines
  • Result consolidation: Aggregate scores across suites and models
  • Publishing: Export results to InfluxDB, Parquet, and DuckDB
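
The multi-model capability above can be pictured as a fan-out over goroutines. The following is a minimal sketch only, not the code in agent.go: the names Scorer, Result, and ScoreAll are hypothetical, and the fake scorer in main stands in for a real scoring call.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Scorer is a hypothetical stand-in for whatever scores one model's responses.
type Scorer func(model string) (float64, error)

// Result holds the outcome of scoring a single model.
type Result struct {
	Model string
	Score float64
	Err   error
}

// ScoreAll scores every model concurrently, waits for all of them,
// and returns the results sorted by model name.
func ScoreAll(models []string, score Scorer) []Result {
	results := make([]Result, len(models))
	var wg sync.WaitGroup
	for i, m := range models {
		wg.Add(1)
		go func(i int, m string) {
			defer wg.Done()
			s, err := score(m)
			// Each goroutine writes to its own slot, so no lock is needed.
			results[i] = Result{Model: m, Score: s, Err: err}
		}(i, m)
	}
	wg.Wait()
	sort.Slice(results, func(a, b int) bool { return results[a].Model < results[b].Model })
	return results
}

func main() {
	// Fake scorer for illustration: the score is the length of the model name.
	fake := func(model string) (float64, error) { return float64(len(model)), nil }
	for _, r := range ScoreAll([]string{"model-b", "model-a"}, fake) {
		fmt.Printf("%s %.1f\n", r.Model, r.Score)
	}
}
```

Giving each goroutine its own slice index avoids a mutex; a real orchestrator would likely also bound concurrency and carry a context for cancellation.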

Data Pipeline

Ingest (ingest.go, import_all.go)
    ↓
Store (db.go — DuckDB, parquet.go — Parquet)
    ↓
Score (score.go → heuristic/judge/exact/probes)
    ↓
Consolidate (consolidate.go, compare.go)
    ↓
Export (export.go, publish.go, influx.go)
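
One way to read the diagram above is as an ordered list of stages where a failure stops everything downstream. The sketch below illustrates that reading under assumptions: Stage, RunPipeline, and errStageFailed are hypothetical names, and the stage bodies are placeholders rather than the package's actual functions.

```go
package main

import (
	"errors"
	"fmt"
)

// errStageFailed is a placeholder error used to simulate a failing stage.
var errStageFailed = errors.New("stage failed")

// Stage is a hypothetical named pipeline step.
type Stage struct {
	Name string
	Run  func() error
}

// RunPipeline executes stages in order and stops at the first failure.
// It returns the names of the stages that completed successfully.
func RunPipeline(stages []Stage) ([]string, error) {
	var done []string
	for _, s := range stages {
		if err := s.Run(); err != nil {
			return done, fmt.Errorf("%s: %w", s.Name, err)
		}
		done = append(done, s.Name)
	}
	return done, nil
}

func main() {
	ok := func() error { return nil }
	stages := []Stage{
		{Name: "ingest", Run: ok},
		{Name: "store", Run: ok},
		{Name: "score", Run: ok},
		{Name: "consolidate", Run: ok},
		{Name: "export", Run: ok},
	}
	done, err := RunPipeline(stages)
	fmt.Println(done, err)
}
```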

GGUF Model Management

gguf.go handles model inventory and metadata:

  • Scan directories for GGUF files
  • Extract model metadata (parameters, quantisation level)
  • Report model sizes and capabilities
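
As an illustration of the scanning step, the sketch below walks a directory, keeps files with a .gguf extension, and reads the fixed file header: the 4-byte magic "GGUF" followed by a 32-bit version, assumed little-endian here. It is not the code in gguf.go; ModelFile, parseGGUFHeader, and ScanGGUF are hypothetical names, and parsing the metadata key/value section (parameters, quantisation level) is left out.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"io"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// ModelFile is a hypothetical inventory record for one GGUF file.
type ModelFile struct {
	Path    string
	Size    int64
	Version uint32
}

// parseGGUFHeader checks the magic bytes and returns the format version.
// It expects at least the first 8 bytes of the file.
func parseGGUFHeader(hdr []byte) (uint32, error) {
	if len(hdr) < 8 {
		return 0, fmt.Errorf("header too short: %d bytes", len(hdr))
	}
	if string(hdr[:4]) != "GGUF" {
		return 0, fmt.Errorf("bad magic %q", hdr[:4])
	}
	return binary.LittleEndian.Uint32(hdr[4:8]), nil
}

// ScanGGUF walks root and returns every file that has a .gguf extension
// and a valid GGUF header. Files with an invalid header are skipped.
func ScanGGUF(root string) ([]ModelFile, error) {
	var found []ModelFile
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() || !strings.EqualFold(filepath.Ext(path), ".gguf") {
			return nil
		}
		f, err := os.Open(path)
		if err != nil {
			return err
		}
		defer f.Close()
		var hdr [8]byte
		if _, err := io.ReadFull(f, hdr[:]); err != nil {
			return nil // too short to be a GGUF file
		}
		version, err := parseGGUFHeader(hdr[:])
		if err != nil {
			return nil // wrong magic, skip
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		found = append(found, ModelFile{Path: path, Size: info.Size(), Version: version})
		return nil
	})
	return found, err
}

func main() {
	models, err := ScanGGUF(".")
	if err != nil {
		fmt.Println("scan failed:", err)
		return
	}
	for _, m := range models {
		fmt.Printf("%s  %d bytes  GGUF v%d\n", m.Path, m.Size, m.Version)
	}
}
```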

Status Reporting

status.go provides real-time progress and summary reporting during long-running evaluation campaigns.
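
A progress tracker of this kind could look roughly like the sketch below. This is an assumption about the general shape, not the contents of status.go: Progress, NewProgress, Record, and Summary are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// Progress is a hypothetical thread-safe counter for a long-running campaign.
type Progress struct {
	mu     sync.Mutex
	total  int
	done   int // finished items, successful or not
	failed int // subset of done that ended in an error
}

// NewProgress creates a tracker for the given number of work items.
func NewProgress(total int) *Progress {
	return &Progress{total: total}
}

// Record marks one item as finished; ok is false if it failed.
func (p *Progress) Record(ok bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.done++
	if !ok {
		p.failed++
	}
}

// Summary returns a one-line snapshot such as "2/4 done (50.0%), 1 failed".
func (p *Progress) Summary() string {
	p.mu.Lock()
	defer p.mu.Unlock()
	pct := 0.0
	if p.total > 0 {
		pct = 100 * float64(p.done) / float64(p.total)
	}
	return fmt.Sprintf("%d/%d done (%.1f%%), %d failed", p.done, p.total, pct, p.failed)
}

func main() {
	p := NewProgress(4)
	p.Record(true)
	p.Record(false)
	fmt.Println(p.Summary())
}
```

The mutex lets scoring goroutines call Record concurrently while a separate reporter polls Summary.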