---
title: ML Pipeline
description: ML scoring, model management, and inference backend integration.
---
# ML Pipeline
The ML pipeline in go-ai connects the MCP server to the scoring and inference capabilities provided by `go-ml` and `go-inference`. Five MCP tools expose text generation, content scoring, capability probes, status reporting, and backend discovery.
## Architecture
```
MCP Client
| tools/call ml_generate
v
MLSubsystem (go-ai/mcp/tools_ml.go)
|
+-- ml.Service (go-ml)
| +-- InferenceAdapter --> inference.TextModel (go-inference)
| +-- ScoringEngine (heuristic scores)
| +-- JudgeBackend (LLM-as-judge)
|
+-- inference.List() / inference.Get() / inference.Default()
+-- go-mlx (Metal GPU, macOS)
+-- go-rocm (AMD ROCm, Linux)
+-- Ollama (HTTP subprocess)
```
## ML Tools
### `ml_generate`
Generate text using the active inference backend.

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `prompt` | `string` | Yes | The text prompt |
| `model` | `string` | No | Model name (uses default if omitted) |
| `max_tokens` | `int` | No | Maximum tokens to generate |
| `temperature` | `float64` | No | Sampling temperature |

Returns the generated text and metadata about which backend and model were used.
### `ml_score`
Score content using the `go-ml` scoring engine. Three scoring modes are supported:
- **Heuristic** -- Pattern-based scoring across multiple dimensions (emotional register, sycophancy detection, vocabulary diversity, etc.)
- **Semantic** -- LLM-as-judge evaluation using a secondary model
- **Content** -- Combined scoring pipeline

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `content` | `string` | Yes | Text to score |
| `mode` | `string` | No | `heuristic`, `semantic`, or `content` |

Returns dimension-level scores and an aggregate verdict.
### `ml_probe`
Run capability probes against the active model. Probes test specific model capabilities (instruction following, reasoning, factual recall, etc.). There are 23 built-in probes.

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `probe` | `string` | No | Specific probe name (runs all if omitted) |
### `ml_status`
Query the ML pipeline status, including active backends, loaded models, and InfluxDB pipeline health.
### `ml_backends`
List all registered inference backends and their availability status. Calls `inference.List()`, `inference.Get()`, and `inference.Default()` from the `go-inference` registry.
Returns an array of backends with their names, availability flags, and which is currently the default.
## Inference Backend Registry
The `go-inference` module provides a global registry for inference backends. Backends are registered at startup and can be queried at runtime:
```go
// Querying available backends (used by ml_backends tool)
backends := inference.List() // All registered backends
backend := inference.Get("mlx") // Specific backend by name
def := inference.Default() // Currently active default
```
### Supported Backends
| Backend | Module | Platform | Description |
|---------|--------|----------|-------------|
| MLX | `go-mlx` | macOS (Apple Silicon) | Native Metal GPU inference |
| ROCm | `go-rocm` | Linux (AMD GPU) | AMD ROCm GPU inference via llama-server |
| Ollama | `go-ml` | Any | HTTP-based inference via Ollama subprocess |
## Scoring Engine
The scoring engine in `go-ml` provides heuristic analysis across multiple dimensions. Each dimension produces a normalised score (0.0 to 1.0) and a qualitative verdict.
Scoring dimensions include:
- Emotional register (positive and negative pattern detection)
- Sycophancy detection
- Vocabulary diversity
- Sentence complexity
- Repetition analysis
- Format adherence
The `ml_score` tool delegates directly to `go-ml`'s `ml.Service` rather than routing through `go-inference`, since the scoring engine is specific to go-ml and not an abstract backend capability.
## Integration with the MCP Server
The ML subsystem is registered as a plugin during MCP server construction:
```go
svc, err := mcp.New(
	mcp.WithSubsystem(mcp.NewMLSubsystem(mlSvc)),
)
```
`MLSubsystem` implements the `Subsystem` interface and registers all five ML tools when `RegisterTools` is called.
## Testing
ML tools can be tested with mock backends that satisfy the `ml.Backend` and `inference.Backend` interfaces:
```go
type mockMLBackend struct {
	name         string
	available    bool
	generateResp string // canned output for the backend's generation method
	generateErr  error  // canned error for the backend's generation method
}

func (m *mockMLBackend) Name() string    { return m.name }
func (m *mockMLBackend) Available() bool { return m.available }

// (remaining interface methods elided)
Register lightweight mocks for CI environments where GPU backends and model weights are not available:
```go
inference.Register(&mockInferenceBackend{name: "test-ci-mock", available: true})
```
Note that `inference.Register` is global state -- use unique names to avoid conflicts between parallel test runs.