The Decode method strips the SentencePiece leading space from every
token, which loses word boundaries during streaming. DecodeToken
preserves that space, since it marks the word boundary, and strips it
only from the first token of each generation.
Fixes the Gemma3 leading space that appeared in chat UI output.
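
A minimal sketch of the behaviour described above, not the actual
implementation. The names decodeToken and streamDecode and the `first`
flag are hypothetical; the sketch assumes the tokenizer marks word
boundaries with the SentencePiece U+2581 character.

```go
package main

import (
	"fmt"
	"strings"
)

// wordBoundary is the SentencePiece marker (U+2581) that stands for a
// space at the start of a word.
const wordBoundary = "\u2581"

// decodeToken is a hypothetical stand-in for DecodeToken. It turns the
// marker into a real space and drops that space only when the token is
// the first one of a generation.
func decodeToken(piece string, first bool) string {
	text := strings.ReplaceAll(piece, wordBoundary, " ")
	if first {
		text = strings.TrimPrefix(text, " ")
	}
	return text
}

// streamDecode concatenates tokens as a streaming consumer would,
// decoding them one at a time.
func streamDecode(pieces []string) string {
	var b strings.Builder
	for i, p := range pieces {
		b.WriteString(decodeToken(p, i == 0))
	}
	return b.String()
}

func main() {
	pieces := []string{wordBoundary + "Hello", wordBoundary + "world", "!"}
	fmt.Println(streamDecode(pieces)) // prints "Hello world!"
}
```

Stripping the space from every token instead would join the same
pieces into "Helloworld!", which is the lost word boundary described
above.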

Co-Authored-By: Virgil <virgil@lethean.io>

Remove the need to pass -tags mlx manually. MLX is now compiled
automatically on darwin/arm64 via build constraints. Stubs remain for
other platforms. No functional change.
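
A sketch of what a stub file for the other platforms could look like.
The package name mlx and the Available function are hypothetical and
not taken from this change. The native file would carry the opposite
constraint, //go:build darwin && arm64.

```go
//go:build !(darwin && arm64)

// Package mlx (stub variant) is compiled on every platform except
// darwin/arm64, so callers build everywhere without extra -tags flags.
package mlx

// Available is a hypothetical helper that reports whether the native
// MLX backend was compiled in. The stub always reports false.
func Available() bool { return false }
```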

Co-Authored-By: Virgil <virgil@lethean.io>

LEM scoring pipeline, native MLX Metal bindings, Claude SDK wrapper,
RAG with Qdrant/Ollama, unified AI facade, and MCP protocol server.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>