Snider 07cd917259 feat: define shared TextModel, Backend, Token, Message interfaces

Zero-dependency interface package for the Core inference ecosystem.
Backends (go-mlx, go-rocm) implement these interfaces.
Consumers (go-ml, go-ai, go-i18n) import them.

Includes:
- TextModel: Generate, Chat, Err, Close (with context.Context)
- Backend: Named engine registry with platform preference
- Functional options: WithMaxTokens, WithTemperature, WithTopK, etc.
- LoadModel: Auto-selects best available backend

Co-Authored-By: Virgil <virgil@lethean.io>

2026-02-19 19:37:27 +00:00

2.4 KiB


CLAUDE.md

What This Is

Shared inference interfaces for the Core Go ecosystem. Module: forge.lthn.ai/core/go-inference

This package defines the contract between GPU-specific backends (go-mlx on macOS, go-rocm on Linux) and consumers (go-ml, go-ai, go-i18n). It has zero dependencies and compiles on all platforms.

Commands

go test ./...        # Run all tests
go vet ./...         # Vet

Architecture

go-inference (this package) ← defines TextModel, Backend, Token, Message
    ↑                    ↑
    │                    │
go-mlx (darwin/arm64)   go-rocm (linux/amd64)
    │                    │
    └────── go-ml ───────┘   (wraps backends into scoring engine)
             ↑
          go-ai (MCP hub)

Key Types

| Type | Purpose |
|------|---------|
| `TextModel` | Core interface: Generate, Chat, Err, Close |
| `Backend` | Named engine that can LoadModel → TextModel |
| `Token` | Streaming token (ID + Text) |
| `Message` | Chat message (Role + Content) |
| `GenerateOption` | Functional option for generation (temp, topK, etc.) |
| `LoadOption` | Functional option for model loading (backend, GPU layers, etc.) |

Backend Registry

Backends register themselves via init(), gated by build tags. Consumers call LoadModel(), which auto-selects the best available backend:

// Auto-detect: Metal on macOS, ROCm on Linux
m, err := inference.LoadModel("/path/to/model/")

// Explicit backend
m, err := inference.LoadModel("/path/", inference.WithBackend("rocm"))
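
One way such a registry could work is sketched below. `LoadModel` and `WithBackend` are named in this document; `Register`, the `Available` method, the `preference` slice, the backend name `"metal"`, and the stub types are hypothetical and exist only to make the example self-contained. In the real layout each backend would presumably call the registration function from `init()` in a file guarded by a build tag such as `//go:build darwin && arm64`, rather than from `main`.

```go
package main

import (
	"errors"
	"fmt"
)

// TextModel is reduced to Close here to keep the sketch short.
type TextModel interface{ Close() error }

// Backend is a named engine; Available is a hypothetical runtime check.
type Backend interface {
	Name() string
	Available() bool
	LoadModel(path string) (TextModel, error)
}

// LoadConfig and LoadOption follow the functional-options pattern.
type LoadConfig struct{ Backend string }
type LoadOption func(*LoadConfig)

// WithBackend forces a specific backend by name instead of auto-selection.
func WithBackend(name string) LoadOption {
	return func(c *LoadConfig) { c.Backend = name }
}

var (
	registry   = map[string]Backend{}
	preference = []string{"metal", "rocm"} // assumed preference order
)

// Register adds a backend; init() functions run sequentially, so no lock is used here.
func Register(b Backend) { registry[b.Name()] = b }

// LoadModel honours an explicit backend, else picks the first preferred one that is available.
func LoadModel(path string, opts ...LoadOption) (TextModel, error) {
	var cfg LoadConfig
	for _, opt := range opts {
		opt(&cfg)
	}
	if cfg.Backend != "" {
		b, ok := registry[cfg.Backend]
		if !ok {
			return nil, fmt.Errorf("inference: backend %q not registered", cfg.Backend)
		}
		return b.LoadModel(path)
	}
	for _, name := range preference {
		if b, ok := registry[name]; ok && b.Available() {
			return b.LoadModel(path)
		}
	}
	return nil, errors.New("inference: no backend available")
}

// stubBackend and stubModel stand in for real GPU backends.
type stubBackend struct {
	name      string
	available bool
}

func (s stubBackend) Name() string    { return s.name }
func (s stubBackend) Available() bool { return s.available }
func (s stubBackend) LoadModel(path string) (TextModel, error) {
	return stubModel{backend: s.name, path: path}, nil
}

type stubModel struct{ backend, path string }

func (stubModel) Close() error { return nil }

func main() {
	// Simulate a Linux host: Metal is registered but unavailable, ROCm is usable.
	Register(stubBackend{name: "metal", available: false})
	Register(stubBackend{name: "rocm", available: true})

	m, err := LoadModel("/path/to/model/")
	if err != nil {
		fmt.Println("load failed:", err)
		return
	}
	defer m.Close()
	fmt.Println("selected backend:", m.(stubModel).backend) // prints "selected backend: rocm"
}
```

Because the consumer only sees `TextModel`, swapping the backend does not change calling code.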

Coding Standards

UK English
Zero external dependencies — stdlib only
Tests: testify assert/require
Conventional commits
Co-Author: Co-Authored-By: Virgil <virgil@lethean.io>
Licence: EUPL-1.2

Backends and Consumers

go-mlx: Implements Backend + TextModel for Apple Metal (darwin/arm64)
go-rocm: Implements Backend + TextModel for AMD ROCm (linux/amd64)
go-ml: Wraps inference backends into scoring engine, adds llama.cpp HTTP backend
go-ai: MCP hub, exposes inference via MCP tools
go-i18n: Uses TextModel for Gemma3-1B domain classification

Stability

This package is the shared contract. Changes here affect all backends and consumers. Keep the interface minimal and stable. Add new methods only when two or more consumers need them.
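
Because every backend depends on this contract, an implementation may want interface drift to surface at compile time. The sketch below uses Go's standard compile-time assertion idiom; `nullModel` is a hypothetical implementation, and `TextModel` is reduced to two of its four methods for brevity.

```go
package main

import "fmt"

// TextModel is a reduced stand-in for the shared interface.
type TextModel interface {
	Err() error
	Close() error
}

// nullModel is a hypothetical backend model that does nothing.
type nullModel struct{}

func (*nullModel) Err() error   { return nil }
func (*nullModel) Close() error { return nil }

// Compile-time assertion: the build fails here if *nullModel stops satisfying TextModel.
var _ TextModel = (*nullModel)(nil)

func main() {
	var m TextModel = &nullModel{}
	fmt.Println(m.Err() == nil, m.Close() == nil) // prints "true true"
}
```

If a method were added to the interface, the assertion line in each backend would stop compiling, so the break shows up in the implementing package.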

Task Queue

See TODO.md for prioritised work. See FINDINGS.md for research notes.
