go-inference/docs/development.md
Snider 74b3e7a53d docs: graduate TODO/FINDINGS into production documentation
Replace internal task tracking (TODO.md, FINDINGS.md) with structured
documentation in docs/. Trim CLAUDE.md to agent instructions only.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-20 15:01:55 +00:00

Development Guide — go-inference

Prerequisites

  • Go 1.25 or later (uses iter.Seq and range-over-function iterators, both available since Go 1.23)
  • No CGO, no build tags, no external tools required
  • The package compiles on macOS, Linux, and Windows without modification

Commands

# Run all tests
go test ./...

# Run a single test by name
go test -run TestDefault_Good_Metal ./...

# Vet for common mistakes
go vet ./...

# View test coverage
go test -coverprofile=coverage.out ./...
go tool cover -html=coverage.out

There is no Taskfile in this package; it is small enough that direct go invocations suffice. The parent workspace (/Users/snider/Code/host-uk/core) uses Task for cross-repo operations.

Go Workspace

This package is part of the host-uk/core Go workspace. After adding or changing module dependencies:

go work sync

The workspace root is /Users/snider/Code/host-uk/core. The workspace file (go.work) includes this module alongside cmd/core-gui, cmd/bugseti, and others.

Module Path

forge.lthn.ai/core/go-inference

Import it in consumers:

import "forge.lthn.ai/core/go-inference"

Remote: ssh://git@forge.lthn.ai:2223/core/go-inference.git

Repository Layout

go-inference/
├── inference.go        # TextModel, Backend, Token, Message, registry, LoadModel
├── options.go          # GenerateConfig, LoadConfig, all With* options
├── discover.go         # Discover() and DiscoveredModel
├── inference_test.go   # Tests for registry, LoadModel, all types
├── options_test.go     # Tests for GenerateConfig, LoadConfig, all options
├── discover_test.go    # Tests for Discover()
├── go.mod
├── go.sum
├── CLAUDE.md           # Agent instructions
├── README.md
└── docs/
    ├── architecture.md
    ├── development.md
    └── history.md

Test Patterns

Tests follow the _Good, _Bad, _Ugly suffix convention used across the Core Go ecosystem:

  • _Good — happy path; confirms the documented behaviour works correctly
  • _Bad — expected error conditions; confirms errors are returned with useful messages
  • _Ugly — edge cases, panics, surprising-but-valid behaviour (e.g. last-option-wins, registry overwrites)

func TestDefault_Good_Metal(t *testing.T) { ... }
func TestDefault_Bad_NoBackends(t *testing.T) { ... }
func TestDefault_Ugly_SkipsUnavailablePreferred(t *testing.T) { ... }

Backend Registry Isolation

Tests that touch the global backend registry call resetBackends(t) first. This helper clears the map and is defined in inference_test.go:

func resetBackends(t *testing.T) {
    t.Helper()
    backendsMu.Lock()
    defer backendsMu.Unlock()
    backends = map[string]Backend{}
}

Because resetBackends is in the inference package (not inference_test), it has direct access to the unexported backends map. Tests must not rely on registration order across test functions; each test that uses the registry must call resetBackends at the top.
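
The isolation rule can be sketched outside the test suite as follows. This is a minimal stand-alone model, assuming a one-method Backend interface, an RWMutex, and an overwriting Register. The real helper takes a *testing.T and calls t.Helper(), which is omitted here so that the example runs as an ordinary program.

```go
package main

import (
	"fmt"
	"sync"
)

// Backend is a cut-down stand-in for the real interface (assumption).
type Backend interface {
	Name() string
}

var (
	backendsMu sync.RWMutex
	backends   = map[string]Backend{}
)

// Register stores b under its name, overwriting any previous entry.
func Register(b Backend) {
	backendsMu.Lock()
	defer backendsMu.Unlock()
	backends[b.Name()] = b
}

// resetBackends mirrors the test helper: it swaps in an empty map.
func resetBackends() {
	backendsMu.Lock()
	defer backendsMu.Unlock()
	backends = map[string]Backend{}
}

// registered reports how many backends are currently in the registry.
func registered() int {
	backendsMu.RLock()
	defer backendsMu.RUnlock()
	return len(backends)
}

// stub is a hypothetical backend that only knows its own name.
type stub struct{ name string }

func (s stub) Name() string { return s.name }

func main() {
	// State left behind by an earlier test.
	Register(stub{"metal"})

	// The next test resets first, so it sees only what it registers itself.
	resetBackends()
	Register(stub{"rocm"})
	fmt.Println(registered())
}
```

Without the reset, the second test would see two backends and its outcome would depend on which test ran before it.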

Stub Implementations

inference_test.go provides stubBackend and stubTextModel — minimal implementations of Backend and TextModel for use in registry and routing tests. These are in the inference package itself (not a separate _test package) to allow access to unexported fields.

When writing new tests, use the existing stubs rather than creating new ones unless you need behaviour the stubs do not support.

Table-Driven Tests

Prefer table-driven tests for options and configuration variants. The existing TestApplyGenerateOpts_Good, TestWithTemperature_Good, and TestDefault_Good_PriorityOrder tests demonstrate the pattern:

tests := []struct {
    name string
    val  float32
    want float32
}{
    {"greedy", 0.0, 0.0},
    {"low", 0.3, 0.3},
}
for _, tt := range tests {
    t.Run(tt.name, func(t *testing.T) {
        cfg := ApplyGenerateOpts([]GenerateOption{WithTemperature(tt.val)})
        assert.InDelta(t, tt.want, cfg.Temperature, 0.0001)
    })
}
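
The behaviour such tests exercise is the functional-options pattern. A minimal sketch follows, assuming a one-field GenerateConfig and a zero-value starting point; the real ApplyGenerateOpts may apply defaults, and the real struct has more fields.

```go
package main

import "fmt"

// GenerateConfig carries a single field here; this is a simplification.
type GenerateConfig struct {
	Temperature float32
}

// GenerateOption mutates a GenerateConfig in place.
type GenerateOption func(*GenerateConfig)

// WithTemperature returns an option that sets the sampling temperature.
func WithTemperature(t float32) GenerateOption {
	return func(c *GenerateConfig) { c.Temperature = t }
}

// ApplyGenerateOpts applies opts in order, so a later option
// overrides an earlier one that touches the same field.
func ApplyGenerateOpts(opts []GenerateOption) GenerateConfig {
	var cfg GenerateConfig
	for _, o := range opts {
		o(&cfg)
	}
	return cfg
}

func main() {
	cfg := ApplyGenerateOpts([]GenerateOption{
		WithTemperature(0.3),
		WithTemperature(0.5), // applied last, so this value is kept
	})
	fmt.Println(cfg.Temperature)
}
```

Each row of a table-driven test then reduces to one option list and one expected field value, and the last-option-wins case belongs under an _Ugly test.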

Assertions

Use testify/assert and testify/require:

  • require for preconditions where failure makes subsequent assertions meaningless (e.g. require.NoError(t, err) before using the returned value)
  • assert for all other checks
  • assert.InDelta for float32/float64 comparisons (never ==)
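
The reason for avoiding == can be shown with the standard library alone. The inDelta helper below is a local stand-in that approximates the check assert.InDelta performs; it is not testify code.

```go
package main

import (
	"fmt"
	"math"
)

// inDelta reports whether got lies within delta of want.
// It is a simplified stand-in for the comparison assert.InDelta makes.
func inDelta(want, got, delta float64) bool {
	return math.Abs(want-got) <= delta
}

func main() {
	// Variables force the addition to happen at run time in float64;
	// untyped constants would be evaluated exactly by the compiler.
	a, b := 0.1, 0.2
	sum := a + b

	fmt.Println(sum == 0.3)                // false: rounding error
	fmt.Println(inDelta(0.3, sum, 0.0001)) // true: within tolerance
}
```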

Coding Standards

Language

UK English throughout: colour, organisation, centre, licence (noun), serialise, recognise. American spellings are not accepted in comments, documentation, or error messages.

Formatting

Standard gofmt formatting. No custom style rules. Run gofmt -w . or go fmt ./... before committing.

Error Messages

Error strings start with the package name and a colon, are lowercase, and have no trailing full stop:

fmt.Errorf("inference: no backends registered (import a backend package)")
fmt.Errorf("inference: backend %q not registered", cfg.Backend)
fmt.Errorf("inference: backend %q not available on this hardware", cfg.Backend)

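This convention matches the Go standard library and keeps the origin of an error visible when callers wrap it with %w. Matching with errors.Is and errors.As depends on the wrapped error values, not on the message text.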
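
A short sketch of how the prefix reads once a consumer wraps the error. ErrNotRegistered, loadFromApp, isNotRegistered, and the myapp prefix are hypothetical names introduced for illustration; the real package may not export a sentinel of this kind.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotRegistered is a hypothetical sentinel following the prefix convention.
var ErrNotRegistered = errors.New("inference: backend not registered")

// loadFromApp stands in for a consumer that adds its own context.
// The %w verb keeps the original error in the chain.
func loadFromApp() error {
	return fmt.Errorf("myapp: load model: %w", ErrNotRegistered)
}

// isNotRegistered matches on the wrapped value, not on the message text.
func isNotRegistered(err error) bool {
	return errors.Is(err, ErrNotRegistered)
}

func main() {
	err := loadFromApp()
	fmt.Println(err)
	fmt.Println(isNotRegistered(err))
}
```

Because neither message ends in a full stop or starts with a capital, the joined chain reads as one sentence, and the inference: prefix shows where the failure originated.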

Strict Types

All parameters and return types are explicitly typed. No interface{} or any outside of test helpers where unavoidable.

Dependencies

No new external dependencies may be added to production code: non-test code must import only the standard library. testify is the only permitted test dependency, so it should be the only direct requirement in go.mod.

If you find yourself wanting an external library, reconsider the approach. This package is intentionally minimal.

Licence Header

Every new .go file must carry the EUPL-1.2 licence header:

// Copyright (c) Lethean Technologies Ltd. All rights reserved.
// SPDX-License-Identifier: EUPL-1.2

Existing files without this header will be updated in a future housekeeping pass.

Commit Guidelines

Use conventional commits:

type(scope): short imperative description

Longer explanation if needed. UK English. Wrap at 72 characters.

Types: feat, fix, test, docs, refactor, chore

Scope: inference, options, discover, or omit for cross-cutting changes.

Examples:

feat(inference): add WithParallelSlots load option
fix(discover): handle config.json with invalid JSON gracefully
test(options): add table-driven tests for WithTopP
docs: expand architecture section on registry priority

Always include the co-author trailer:

Co-Authored-By: Virgil <virgil@lethean.io>

Implementing a Backend

To implement a new backend (e.g. go-vulkan for cross-platform GPU inference):

  1. Import forge.lthn.ai/core/go-inference in the new module.
  2. Implement inference.Backend:

type vulkanBackend struct{}

func (b *vulkanBackend) Name() string { return "vulkan" }

func (b *vulkanBackend) Available() bool {
    // Check whether Vulkan runtime is present on this host.
    return vulkan.IsAvailable()
}

func (b *vulkanBackend) LoadModel(path string, opts ...inference.LoadOption) (inference.TextModel, error) {
    cfg := inference.ApplyLoadOpts(opts)
    // Load model using cfg.ContextLen, cfg.GPULayers, etc.
    return &vulkanModel{...}, nil
}

  3. Implement inference.TextModel (all nine methods).
  4. Register in init(), guarded by the appropriate build tag:

//go:build linux && (amd64 || arm64)

func init() { inference.Register(&vulkanBackend{}) }

  5. Write stub-based tests to confirm the backend registers and LoadModel routes correctly without requiring real GPU hardware in CI.
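
The final step (stub-based tests) might look like the following in outline. Everything here is a self-contained stand-in: the cut-down Backend and TextModel interfaces, the ModelType method, and the loadWith router are assumptions made for the example, with error strings borrowed from the Error Messages section. In a real test the stub would be registered through inference.Register.

```go
package main

import "fmt"

// TextModel and Backend are cut-down stand-ins for the real interfaces.
// ModelType is a placeholder method, not a confirmed part of the API.
type TextModel interface {
	ModelType() string
}

type Backend interface {
	Name() string
	Available() bool
	LoadModel(path string) (TextModel, error)
}

var backends = map[string]Backend{}

// Register stores b under its name.
func Register(b Backend) { backends[b.Name()] = b }

// loadWith is a hypothetical router: it looks up the named backend
// and checks availability before delegating to it.
func loadWith(name, path string) (TextModel, error) {
	b, ok := backends[name]
	if !ok {
		return nil, fmt.Errorf("inference: backend %q not registered", name)
	}
	if !b.Available() {
		return nil, fmt.Errorf("inference: backend %q not available on this hardware", name)
	}
	return b.LoadModel(path)
}

// stubBackend records the path it was asked to load; no GPU is touched.
type stubBackend struct {
	name      string
	available bool
	loaded    string
}

func (s *stubBackend) Name() string { return s.name }

func (s *stubBackend) Available() bool { return s.available }

func (s *stubBackend) LoadModel(path string) (TextModel, error) {
	s.loaded = path
	return stubModel{}, nil
}

type stubModel struct{}

func (stubModel) ModelType() string { return "stub" }

func main() {
	gpu := &stubBackend{name: "vulkan", available: true}
	Register(gpu)
	_, err := loadWith("vulkan", "/models/example")
	fmt.Println(gpu.loaded, err)

	Register(&stubBackend{name: "offline", available: false})
	_, err = loadWith("offline", "/models/example")
	fmt.Println(err)
}
```

Recording the path on the stub lets a test assert that the call reached the intended backend, which is all CI needs to verify without hardware.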

Extending the Interface

Before adding a method to TextModel or Backend, consider:

  • Do two or more existing consumers require this capability right now?
  • Can the capability be expressed as a separate interface that embeds TextModel?
  • Will adding this method break existing backend implementations that do not yet provide it?

If the answer to the first question is no, defer the addition. If a separate interface is sufficient, prefer that approach. See docs/architecture.md for the stability contract.
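
The second question often has a cheap answer: an optional interface discovered by type assertion. The sketch below assumes a cut-down TextModel with a single placeholder Generate method and a hypothetical Embedder capability; neither signature is taken from the real package.

```go
package main

import "fmt"

// TextModel is a cut-down stand-in with one placeholder method.
type TextModel interface {
	Generate(prompt string) string
}

// Embedder is a hypothetical extension interface. Backends that
// cannot provide it keep compiling because TextModel is unchanged.
type Embedder interface {
	TextModel
	Embed(text string) []float32
}

// basicModel implements only TextModel.
type basicModel struct{}

func (basicModel) Generate(prompt string) string { return "echo: " + prompt }

// embeddingModel reuses basicModel and adds the optional capability.
type embeddingModel struct{ basicModel }

func (embeddingModel) Embed(text string) []float32 {
	return []float32{float32(len(text))}
}

// embedIfSupported uses the capability when present and reports
// false when the model does not offer it.
func embedIfSupported(m TextModel, text string) ([]float32, bool) {
	e, ok := m.(Embedder)
	if !ok {
		return nil, false
	}
	return e.Embed(text), true
}

func main() {
	for _, m := range []TextModel{basicModel{}, embeddingModel{}} {
		_, ok := embedIfSupported(m, "hello")
		fmt.Printf("%T supports Embed: %v\n", m, ok)
	}
}
```

Consumers that need the capability pay for one type assertion, and backends that lack it are not forced to change.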

When a new method is genuinely necessary, coordinate with the owners of go-mlx, go-rocm, and go-ml before merging. All three must implement the new method in step, because any implementation that lacks it no longer satisfies the interface and fails to compile.