Agent Orchestrator
Overview
agent.go (1,070 LOC) is the largest file in the package. It orchestrates multi-model scoring runs across the evaluation infrastructure.
Capabilities
- Multi-model runs: Score responses from multiple models in parallel
- Remote infrastructure: Schedule scoring on M3 Ultra machines
- Result consolidation: Aggregate scores across suites and models
- Publishing: Export results to InfluxDB, Parquet, and DuckDB
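
As a rough illustration of the multi-model capability, the sketch below fans one scoring call out per model and gathers the results. It is not taken from agent.go: the names `ModelScore`, `scoreFunc`, `scoreAll`, and `lengthScorer` are hypothetical, and the real orchestrator may bound concurrency or schedule work remotely instead.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"sync"
)

// ModelScore is a hypothetical result type; agent.go's real types are not shown here.
type ModelScore struct {
	Model string
	Score float64
}

// scoreFunc stands in for whatever per-model scoring call the orchestrator makes.
type scoreFunc func(model string) (float64, error)

// lengthScorer is a toy scorer for the demo: it rejects empty names and
// otherwise returns the length of the name as the score.
func lengthScorer(model string) (float64, error) {
	if model == "" {
		return 0, errors.New("empty model name")
	}
	return float64(len(model)), nil
}

// scoreAll runs score once per model, each in its own goroutine, and returns
// the successful results sorted by model name plus any per-model errors.
func scoreAll(models []string, score scoreFunc) ([]ModelScore, []error) {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		results []ModelScore
		errs    []error
	)
	for _, m := range models {
		wg.Add(1)
		go func(model string) {
			defer wg.Done()
			s, err := score(model)
			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				errs = append(errs, fmt.Errorf("model %q: %w", model, err))
				return
			}
			results = append(results, ModelScore{Model: model, Score: s})
		}(m)
	}
	wg.Wait()
	// Goroutines finish in arbitrary order, so sort for deterministic output.
	sort.Slice(results, func(i, j int) bool { return results[i].Model < results[j].Model })
	return results, errs
}

func main() {
	results, errs := scoreAll([]string{"model-b", "model-a"}, lengthScorer)
	fmt.Println(results, errs)
}
```

Collecting errors per model rather than aborting on the first one lets a long run finish for the models that do work.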
Data Pipeline
Ingest (ingest.go, import_all.go)
↓
Store (db.go — DuckDB, parquet.go — Parquet)
↓
Score (score.go → heuristic/judge/exact/probes)
↓
Consolidate (consolidate.go, compare.go)
↓
Export (export.go, publish.go, influx.go)
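
The flow above can be read as a chain of stages run in order, where a failure stops everything downstream. The following is a minimal sketch of that idea only. `Stage`, `runPipeline`, and `errStageFailed` are hypothetical names, and the actual files may wire the steps together differently.

```go
package main

import (
	"errors"
	"fmt"
)

// Stage is a hypothetical wrapper for one pipeline step; the real files
// (ingest.go, score.go, export.go, ...) may expose a different shape.
type Stage struct {
	Name string
	Run  func() error
}

// errStageFailed is a placeholder error used only to demonstrate failure handling.
var errStageFailed = errors.New("stage failed")

// runPipeline executes stages in order and stops at the first error.
// It returns the names of the stages that completed before any failure.
func runPipeline(stages []Stage) ([]string, error) {
	var completed []string
	for _, s := range stages {
		if err := s.Run(); err != nil {
			return completed, fmt.Errorf("stage %s: %w", s.Name, err)
		}
		completed = append(completed, s.Name)
	}
	return completed, nil
}

func main() {
	ok := func() error { return nil } // stand-in for real stage work
	stages := []Stage{
		{Name: "Ingest", Run: ok},
		{Name: "Store", Run: ok},
		{Name: "Score", Run: ok},
		{Name: "Consolidate", Run: ok},
		{Name: "Export", Run: ok},
	}
	done, err := runPipeline(stages)
	fmt.Println(done, err)
}
```

Returning the list of completed stages alongside the error makes it clear where a run can be resumed.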
GGUF Model Management
gguf.go handles model inventory and metadata:
- Scan directories for GGUF files
- Extract model metadata (parameters, quantisation level)
- Report model sizes and capabilities
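
A directory scan of this kind might look like the sketch below. It relies on one fact from the GGUF format: a file starts with the 4-byte magic `GGUF` followed by a 32-bit version number, read here as little-endian (the common case). Everything else is an assumption for illustration: `GGUFFile`, `readGGUFHeader`, `scanGGUF`, and `demoScan` are hypothetical names, not gguf.go's API, and parsing the parameter count or quantisation level would need the metadata key-value section, which this sketch does not read.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"io"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// GGUFFile is a hypothetical inventory record.
type GGUFFile struct {
	Path    string
	Size    int64  // size on disk in bytes
	Version uint32 // GGUF format version from the header
}

// readGGUFHeader checks the "GGUF" magic and returns the format version.
func readGGUFHeader(path string) (uint32, error) {
	f, err := os.Open(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()
	var hdr [8]byte
	if _, err := io.ReadFull(f, hdr[:]); err != nil {
		return 0, err
	}
	if string(hdr[:4]) != "GGUF" {
		return 0, fmt.Errorf("%s: missing GGUF magic", path)
	}
	return binary.LittleEndian.Uint32(hdr[4:]), nil
}

// scanGGUF walks root and records every *.gguf file with a valid header.
// Files with the extension but an unreadable or invalid header are skipped.
func scanGGUF(root string) ([]GGUFFile, error) {
	var found []GGUFFile
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() || !strings.EqualFold(filepath.Ext(path), ".gguf") {
			return nil
		}
		version, err := readGGUFHeader(path)
		if err != nil {
			return nil // not a usable GGUF file; keep walking
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		found = append(found, GGUFFile{Path: path, Size: info.Size(), Version: version})
		return nil
	})
	return found, err
}

// demoScan builds a temporary directory holding one header-only fake model
// and one decoy file, scans it, and removes the directory afterwards.
func demoScan() ([]GGUFFile, error) {
	dir, err := os.MkdirTemp("", "gguf-demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	header := []byte{'G', 'G', 'U', 'F', 3, 0, 0, 0} // magic + version 3
	if err := os.WriteFile(filepath.Join(dir, "tiny.gguf"), header, 0o644); err != nil {
		return nil, err
	}
	if err := os.WriteFile(filepath.Join(dir, "notes.gguf"), []byte("not a model"), 0o644); err != nil {
		return nil, err
	}
	return scanGGUF(dir)
}

func main() {
	files, err := demoScan()
	if err != nil {
		fmt.Fprintln(os.Stderr, "scan failed:", err)
		os.Exit(1)
	}
	for _, f := range files {
		fmt.Printf("%s: %d bytes, GGUF v%d\n", filepath.Base(f.Path), f.Size, f.Version)
	}
}
```

Checking the magic bytes instead of trusting the file extension keeps partial downloads and misnamed files out of the inventory.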
Status Reporting
status.go provides real-time progress and summary reporting during long-running evaluation campaigns.
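
One simple way to track progress for such a campaign is a small mutex-guarded counter that workers update and a reporter reads. The sketch below shows that pattern under stated assumptions: `Progress`, `NewProgress`, `Record`, and `Summary` are hypothetical names, and status.go may report different fields or use a different mechanism.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Progress is a hypothetical thread-safe tracker for a long-running campaign.
type Progress struct {
	mu     sync.Mutex
	total  int
	done   int // items finished, whether they succeeded or failed
	failed int
	start  time.Time
}

// NewProgress creates a tracker for a known number of work items.
func NewProgress(total int) *Progress {
	return &Progress{total: total, start: time.Now()}
}

// Record marks one item as finished; ok reports whether it succeeded.
func (p *Progress) Record(ok bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.done++
	if !ok {
		p.failed++
	}
}

// Summary returns a one-line snapshot such as "3/4 (75%), 1 failed".
func (p *Progress) Summary() string {
	p.mu.Lock()
	defer p.mu.Unlock()
	pct := 0.0
	if p.total > 0 {
		pct = 100 * float64(p.done) / float64(p.total)
	}
	return fmt.Sprintf("%d/%d (%.0f%%), %d failed", p.done, p.total, pct, p.failed)
}

// Elapsed reports the wall-clock time since the tracker was created.
func (p *Progress) Elapsed() time.Duration {
	return time.Since(p.start)
}

func main() {
	p := NewProgress(4)
	for _, ok := range []bool{true, false, true} {
		p.Record(ok)
		fmt.Println(p.Summary())
	}
	fmt.Println("elapsed:", p.Elapsed().Round(time.Millisecond))
}
```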