From ff9cf550e8efd0210d3efe6caefce17bce933a35 Mon Sep 17 00:00:00 2001
From: Claude
Date: Thu, 19 Feb 2026 20:41:53 +0000
Subject: [PATCH] docs: flag Token.ID and StopTokens interface questions for Virgil
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

QUESTION: Token.ID always 0 — llama-server SSE doesn't include token IDs
QUESTION: StopTokens []int32 vs llama-server stop []string mismatch

Co-Authored-By: Virgil
Co-Authored-By: Claude Opus 4.6
---
 FINDINGS.md | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/FINDINGS.md b/FINDINGS.md
index bf7c94f..2ccfd19 100644
--- a/FINDINGS.md
+++ b/FINDINGS.md
@@ -172,3 +172,26 @@ cmd.Env = append(os.Environ(),
 ### Model Path Note
 
 Models are on SMB mount (`/data` = `//10.69.69.108/Data`). For CI/testing, copy a small model locally or use `t.Skip()` when the mount is unavailable.
+
+---
+
+## 2026-02-19: Phase 1 Plan Review — Interface Questions
+
+### QUESTION: Token.ID not populated by llama-server SSE
+
+llama-server's OpenAI-compatible streaming API (`/v1/chat/completions`, `/v1/completions`) does not include token IDs in the default SSE response. The `inference.Token` struct has `ID int32` and `Text string` — go-rocm will set `Text` but leave `ID` as 0 for all tokens.
+
+Token IDs are available via `logprobs: true` in the request, but this adds overhead and requires parsing the `logprobs.tokens` field.
+
+**Decision needed from Virgil:** Does any consumer (go-ml, go-i18n, go-ai) rely on `Token.ID`? If only `Token.Text` is used downstream, ID=0 is acceptable for Phase 1. If ID is needed, we'll add logprobs parsing.
+
+### QUESTION: StopTokens type mismatch
+
+`GenerateConfig.StopTokens` is `[]int32` (token IDs), but llama-server's OpenAI-compatible API expects `"stop"` as `[]string` (text sequences). These are fundamentally different — token IDs cannot be mapped to stop strings without a tokeniser.
+
+Options:
+1. Ignore `StopTokens` in go-rocm Phase 1 (no consumer uses it yet)
+2. Use llama-server's native `/completion` endpoint which supports `id_slot` stop tokens
+3. Add `StopStrings []string` to `GenerateConfig` in go-inference alongside the existing `StopTokens []int32`, and let each backend use whichever it supports
+
+**Decision needed from Virgil:** Which approach? Option 3 would be a go-inference interface change. Option 1 is simplest for now — go-rocm silently ignores StopTokens if set.