core/cli - Forgejo: Beyond coding. We Forge.

core/cli

Commit graph

Author

SHA1

Message

Date

Claude

56c6e2fa8d

feat: support quantized inference (4-bit) for Gemma 3

- Add QuantizedLinear with QuantizedMatmul for packed uint32 weights
- Add quantized Embedding with Dequantize before lookup
- Parse quantization config (group_size, bits) from config.json
- Detect .scales/.biases weight tensors and auto-select quantized path
- Add Dequantize op wrapping mlx_dequantize
- Add safety guard to KVCache.Update for malformed shapes
- Handle tied embeddings with quantization (AsLinear helper)
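The quantization steps listed above can be sketched in plain Go, without MLX. This is an illustration only, not the code from this commit: the type and function names (`QuantConfig`, `ParseQuantConfig`, `IsQuantized`, `Dequantize4`), the `"quantization"` key in config.json, and the low-nibble-first packing order are all assumptions. The affine form `w = scale*q + bias` is the usual scheme for grouped 4-bit weights, and a single scale/bias pair stands in for one group here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// QuantConfig mirrors an assumed "quantization" object in config.json.
type QuantConfig struct {
	GroupSize int `json:"group_size"`
	Bits      int `json:"bits"`
}

// modelConfig holds only the field this sketch cares about.
type modelConfig struct {
	Quantization *QuantConfig `json:"quantization"`
}

// ParseQuantConfig returns nil (and no error) when the config
// has no quantization block, i.e. the model is not quantized.
func ParseQuantConfig(data []byte) (*QuantConfig, error) {
	var c modelConfig
	if err := json.Unmarshal(data, &c); err != nil {
		return nil, err
	}
	return c.Quantization, nil
}

// IsQuantized reports whether a layer has companion .scales and
// .biases tensors, which is the signal for choosing the quantized path.
func IsQuantized(tensorNames map[string]bool, prefix string) bool {
	return tensorNames[prefix+".scales"] && tensorNames[prefix+".biases"]
}

// Dequantize4 unpacks eight 4-bit values from each uint32 (lowest
// nibble first, an assumption) and applies w = scale*q + bias.
func Dequantize4(packed []uint32, scale, bias float32) []float32 {
	out := make([]float32, 0, len(packed)*8)
	for _, word := range packed {
		for i := uint(0); i < 8; i++ {
			q := (word >> (4 * i)) & 0xF
			out = append(out, scale*float32(q)+bias)
		}
	}
	return out
}

func main() {
	cfg, err := ParseQuantConfig([]byte(`{"quantization":{"group_size":64,"bits":4}}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.GroupSize, cfg.Bits)

	// Hypothetical tensor names; real checkpoints may name layers differently.
	names := map[string]bool{
		"layer.weight": true, "layer.scales": true, "layer.biases": true,
	}
	fmt.Println(IsQuantized(names, "layer"))
	fmt.Println(Dequantize4([]uint32{0x76543210}, 0.5, -1))
}
```

In a real backend the unpacking would stay on the GPU (the commit mentions QuantizedMatmul and a Dequantize op wrapping mlx_dequantize); the CPU loop above only shows what the packed layout means.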

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-16 05:53:52 +00:00

Claude

bc28aad526

feat: add native MLX backend for Apple Silicon inference (pkg/mlx)

CGo wrapper for mlx-c providing zero-Python Metal GPU inference.
Includes Gemma 3 model architecture, BPE tokenizer, KV cache,
composable sampling, and OpenAI-compatible serve command.

Build-tagged (darwin && arm64 && mlx) with stubs for cross-platform.
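A build-tagged package with cross-platform stubs usually pairs two files carrying opposite constraints. The sketch below shows only the stub side, under assumed names (`Available` and `ErrUnsupported` are hypothetical and not taken from pkg/mlx); the real implementation would live in a sibling file headed by `//go:build darwin && arm64 && mlx`.

```go
//go:build !(darwin && arm64 && mlx)

// Stub file: compiled on every platform where the MLX backend is not
// enabled, so callers can build without CGo or mlx-c installed.
package main

import (
	"errors"
	"fmt"
)

// ErrUnsupported is a hypothetical sentinel returned by stubbed calls.
var ErrUnsupported = errors.New("mlx: backend not built (need darwin/arm64 and -tags mlx)")

// Available reports whether the native backend was compiled in.
// The stub always answers false.
func Available() bool { return false }

func main() {
	if !Available() {
		fmt.Println(ErrUnsupported)
	}
}
```

With this layout, `go build` on any platform selects the stub, while `go build -tags mlx` on an Apple Silicon Mac would select the real file instead.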

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-16 05:53:52 +00:00

2 commits