Commit graph

7 commits

Claude
c5bc97de19 feat: port 11 LEM data management commands into core ml
Ports all remaining LEM pipeline commands from pkg/lem into core ml,
eliminating the standalone LEM CLI dependency. Each command is split
into reusable business logic (pkg/ml/) and a thin cobra wrapper
(internal/cmd/ml/).
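
The split described here (reusable business logic in pkg/ml, a thin cobra wrapper in internal/cmd/ml) might look like the sketch below, collapsed into a single file. The command chosen and every identifier are illustrative, not the actual ported code:

```go
package ml

import (
	"context"
	"fmt"

	"github.com/spf13/cobra"
)

// InventoryOptions and Inventory stand in for the reusable logic that
// would live in pkg/ml; the real implementation queries the datastore.
type InventoryOptions struct {
	DBPath string
}

func Inventory(ctx context.Context, opts InventoryOptions) error {
	fmt.Println("inventory for", opts.DBPath)
	return nil
}

// NewInventoryCmd is the thin cobra wrapper that would live under
// internal/cmd/ml: it only parses flags and delegates to Inventory.
func NewInventoryCmd() *cobra.Command {
	var opts InventoryOptions
	cmd := &cobra.Command{
		Use:   "inventory",
		Short: "Summarize pipeline data inventory",
		RunE: func(cmd *cobra.Command, _ []string) error {
			return Inventory(cmd.Context(), opts)
		},
	}
	cmd.Flags().StringVar(&opts.DBPath, "db", "", "path to the database file")
	return cmd
}
```

Keeping cobra out of the business-logic package is what makes the same commands reusable from other entry points.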

New commands: query, inventory, metrics, ingest, normalize, seed-influx,
consolidate, import-all, approve, publish, coverage.

Adds Path(), Exec(), and QueryRowScan() convenience methods to the DB type.
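
Plausible shapes for those three conveniences, assuming DB wraps a database/sql connection; the real signatures in pkg/ml may differ:

```go
package ml

import "database/sql"

// DB is a minimal stand-in for the package's database handle.
type DB struct {
	path string
	conn *sql.DB
}

// Path returns the on-disk location the database was opened from.
func (d *DB) Path() string { return d.path }

// Exec runs a statement, keeping only the error.
func (d *DB) Exec(query string, args ...any) error {
	_, err := d.conn.Exec(query, args...)
	return err
}

// QueryRowScan runs a single-row query and scans the result into dest.
func (d *DB) QueryRowScan(query string, args []any, dest ...any) error {
	return d.conn.QueryRow(query, args...).Scan(dest...)
}
```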

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 05:53:52 +00:00
Claude
045f8fc110 feat: add Metal memory budget monitoring after each request
Tracks the model size at load time and checks Metal active memory after
each generation. If usage exceeds 3× the model size, it forces two GC
passes and a Metal cache clear as a safety net.
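
A minimal sketch of that safety net, assuming the check runs after each generation; the clearCache hook stands in for the project's mlx-c cache binding, not a known API:

```go
package mlxguard

import "runtime"

// budgetFactor triggers cleanup once active Metal memory exceeds
// three times the model's load-time size.
const budgetFactor = 3

// CheckBudget forces two GC passes and a Metal cache clear when the
// active footprint blows past the budget.
func CheckBudget(modelBytes, activeBytes uint64, clearCache func()) {
	if activeBytes <= budgetFactor*modelBytes {
		return
	}
	runtime.GC() // first pass queues finalizers that release C handles
	runtime.GC() // second pass reclaims what those finalizers freed
	clearCache() // hand cached Metal buffers back to the allocator
}
```

The second pass exists because finalizers queued by the first run asynchronously, and the objects they belonged to are only reclaimed on a later cycle.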

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 05:53:52 +00:00
Claude
1d4ec55d05 fix: add GC-based memory management for MLX array handles
The Go GC cannot see Metal/C memory pressure, so intermediate arrays from
each forward pass accumulated without bound, causing OOM kills after
3-4 requests. Fix: runtime.SetFinalizer on every Array releases C
handles when GC collects them, and runtime.GC() is forced every 4
tokens during generation. Also adds SetMemoryLimit(24GB) as a hard
Metal ceiling.
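
A sketch of the finalizer pattern, assuming Array wraps a raw C handle; the field layout and freeHandle are illustrative:

```go
package mlx

import "runtime"

// Array wraps a C-side MLX array handle (layout assumed).
type Array struct {
	handle uintptr
}

// newArray attaches a finalizer so the C allocation is released when the
// Go wrapper becomes unreachable. Without it, Go's GC never feels the
// Metal-side pressure, so collections are too rare to keep up.
func newArray(h uintptr) *Array {
	a := &Array{handle: h}
	runtime.SetFinalizer(a, func(a *Array) {
		freeHandle(a.handle)
	})
	return a
}

// freeHandle stands in for the CGo release call in the real code.
func freeHandle(h uintptr) {}
```

Because finalizers only run when a collection happens, the commit also forces runtime.GC() every 4 tokens so handles are reclaimed mid-generation instead of piling up until the request ends.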

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 05:53:52 +00:00
Claude
e6ada25bd8 fix: add Metal cache management to prevent memory growth
- Add ClearCache() wrapping mlx_clear_cache
- Clear Metal allocator cache every 8 tokens during generation
- Set 16GB cache limit on backend init
- Prevents GPU memory from growing unbounded during inference (see the sketch below)
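
The cadence from these bullets might look like this; ClearCache is named in the commit, while the cache-limit setter's real name and both stubs here are assumptions:

```go
package main

// Stubs standing in for the pkg/mlx wrappers around mlx_clear_cache
// and the allocator's cache-limit call.
func setCacheLimit(bytes uint64) {}
func clearCache()                {}

func main() {
	setCacheLimit(16 << 30) // 16 GiB allocator-cache ceiling at backend init

	const maxTokens = 256
	for tok := 1; tok <= maxTokens; tok++ {
		// ... one decode step producing the next token ...
		if tok%8 == 0 {
			clearCache() // trim the Metal allocator cache every 8 tokens
		}
	}
}
```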

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 05:53:52 +00:00
Claude
f76bf0f0c0 fix: correct SDPA mask mode and slice logits to last position
2026-02-16 05:53:52 +00:00
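
Slicing logits to the last position matters because only the newest token's distribution feeds the sampler; a minimal sketch, assuming logits for one batch are laid out as [seqLen][vocab]:

```go
// lastLogits returns the row for the final sequence position, the only
// one needed when sampling the next token during decoding.
func lastLogits(logits [][]float32) []float32 {
	return logits[len(logits)-1]
}
```
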
Claude
8ee0c4bc4e feat: add native MLX backend for Apple Silicon inference (pkg/mlx)
CGo wrapper for mlx-c providing zero-Python Metal GPU inference.
Includes Gemma 3 model architecture, BPE tokenizer, KV cache,
composable sampling, and OpenAI-compatible serve command.

Build-tagged (darwin && arm64 && mlx), with stubs so cross-platform builds still compile.
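
The build-tag arrangement could look like the pair of files below; Available is an illustrative function, not the package's real surface:

```go
// mlx_native.go -- compiled only on Apple Silicon with the mlx tag
//go:build darwin && arm64 && mlx

package mlx

// Available reports that native Metal inference is compiled in.
func Available() bool { return true }
```

```go
// mlx_stub.go -- everywhere else, so cross-platform builds still link
//go:build !(darwin && arm64 && mlx)

package mlx

// Available reports that the native backend is unavailable.
func Available() bool { return false }
```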

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 05:53:52 +00:00
Claude
5ff4b8a2eb feat: add ML inference, scoring, and training pipeline (pkg/ml)
Ports the LEM scoring/training pipeline into CoreGo as pkg/ml with:
- Inference abstraction with HTTP, llama-server, and Ollama backends (sketched below)
- 3-tier scoring engine (heuristic, exact, LLM judge)
- Capability and content probes for model evaluation
- GGUF/safetensors format converters, MLX to PEFT adapter conversion
- DuckDB integration for training data pipeline
- InfluxDB metrics for lab dashboard
- Training data export (JSONL + Parquet)
- Expansion generation pipeline with distributed workers
- 10 CLI commands under 'core ml' (score, probe, export, expand, status, gguf, convert, agent, worker)
- 5 MCP tools (ml_generate, ml_score, ml_probe, ml_status, ml_backends)

All 37 ML tests passing. Binary builds at 138MB with all commands.
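
A rough sketch of the inference abstraction those backends plug into; the interface name and method set are assumptions, not the actual pkg/ml API:

```go
package ml

import "context"

// Backend abstracts a text-generation endpoint. The HTTP, llama-server,
// and Ollama backends listed above would each satisfy it.
type Backend interface {
	// Name identifies the backend ("http", "llama-server", "ollama", ...).
	Name() string
	// Generate returns the model's completion for prompt.
	Generate(ctx context.Context, prompt string) (string, error)
}
```

An interface at this seam lets the LLM-judge tier of the scoring engine, the probes, and the expansion workers share backends without caring which server answers.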

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 05:53:52 +00:00