feat/ml-integration #2

Merged
Snider merged 81 commits from feat/ml-integration into dev 2026-02-16 06:19:10 +00:00
Owner
No description provided.
Snider added 5 commits 2026-02-16 06:06:15 +00:00
Port LEM scoring pipeline into CoreGo pkg/ml/:
- Backend interface abstracting HTTP, llama-server, and future backends
- HTTPBackend for OpenAI-compatible APIs with retry logic
- LlamaBackend managing llama-server via pkg/process
- Scoring engine with heuristic, semantic, content, and exact suites
- Judge for LLM-based multi-dimensional scoring
- 23 capability probes (math, logic, reasoning, code)
- 6 sovereignty content probes
- GGUF/PEFT format helpers, safetensors reader
- 37 tests passing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
refactor: rename module from github.com/host-uk/core to forge.lthn.ai/core/cli
Some checks are pending
Security Scan / Go Vulnerability Check (push) Waiting to run
Security Scan / Secret Detection (push) Waiting to run
Security Scan / Dependency & Config Scan (push) Waiting to run
3fdc3f3086
Move Go module path to production Forgejo instance.
Updates all imports, go.mod, go.sum, docs, and CI configs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
feat(ml): add format converters, data pipeline, and scoring agent
fcd1758b7d
Port remaining lem-repo components into pkg/ml/:
- convert.go: safetensors reader/writer, MLX→PEFT converter
- gguf.go: GGUF v3 writer, MLX→GGUF LoRA converter
- export.go: training data JSONL export with split/filter
- parquet.go: Parquet export with snappy compression
- db.go: DuckDB wrapper for golden set and expansion prompts
- influx.go: InfluxDB v3 client for metrics/status
- ollama.go: Ollama model management (create/delete with adapters)
- status.go: training and generation status display
- expand.go: expansion generation pipeline (Backend interface)
- agent.go: scoring agent with probe running and InfluxDB push
- worker.go: distributed worker for LEM API task processing

Adds parquet-go and go-duckdb dependencies.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
feat(ml): add CoreGo service wrapper and CLI commands (Tasks 6-7)
3dbb5988a8
Service registration with DI lifecycle, typed options, and backend
management. Ten CLI subcommands under `core ml` for scoring, probing,
export, expansion, status, GGUF/PEFT conversion, agent, and worker.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
feat(mcp): add ML tools subsystem and fix MCP service extension points
5fd7705580
Add 5 ML MCP tools (ml_generate, ml_score, ml_probe, ml_status,
ml_backends) as a Subsystem. Fix pre-existing gaps: add Subsystems(),
Shutdown(), WithProcessService, WithWSHub, WSHub(), ProcessService()
methods, and subsystem registration loop in New().

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Virgil added 75 commits 2026-02-16 06:15:27 +00:00
q.load() accesses shared state (issues, seen, current) without holding
the mutex, creating a race condition. Hold q.mu around the call.

Fixes #52

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
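A minimal sketch of the locking pattern described above. Only q.load(), q.mu, and the field names issues, seen, and current come from the commit message; the queue type, reload(), and size() are hypothetical stand-ins for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// queue is a hypothetical stand-in for the queue service.
type queue struct {
	mu      sync.Mutex
	issues  []string
	seen    map[string]bool
	current string
}

// load must be called with q.mu held. Here it restores fixed sample data.
func (q *queue) load() {
	q.issues = []string{"issue-1", "issue-2"}
	q.seen = map[string]bool{"issue-1": true}
	q.current = "issue-1"
}

// reload takes the lock around load so concurrent readers never observe
// partially restored state.
func (q *queue) reload() {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.load()
}

// size reads shared state under the same lock.
func (q *queue) size() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	return len(q.issues)
}

func main() {
	q := &queue{}
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			q.reload()
			_ = q.size()
		}()
	}
	wg.Wait()
	fmt.Println(q.size())
}
```

Running a sketch like this under `go run -race` is one way to check that no unguarded access remains.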
Change the BugSETI configuration file permissions from 0644 to 0600 so
that only the owner can access it. The GitHub token stored in the
config file was previously world-readable.

Fixes #53
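A hedged sketch of writing a config file with owner-only permissions. The function names and the config.json file name are assumptions, not taken from the BugSETI code. os.WriteFile applies the mode only when it creates the file, so the sketch also calls os.Chmod to tighten a file that already exists as 0644.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeConfig writes data and ensures owner-only access (0600), even if the
// file already existed with a looser mode.
func writeConfig(path string, data []byte) error {
	if err := os.WriteFile(path, data, 0o600); err != nil {
		return err
	}
	return os.Chmod(path, 0o600)
}

// demo simulates an existing world-readable config and returns the
// permission string after rewriting it.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "bugseti-config")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "config.json") // hypothetical file name
	if err := os.WriteFile(path, []byte("{}"), 0o644); err != nil {
		return "", err
	}
	if err := writeConfig(path, []byte(`{"token":"redacted"}`)); err != nil {
		return "", err
	}
	info, err := os.Stat(path)
	if err != nil {
		return "", err
	}
	return info.Mode().Perm().String(), nil
}

func main() {
	mode, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(mode)
}
```

The permission bits are meaningful on Unix-like systems; Windows reports file modes differently.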
The workspaces map in WorkspaceService grew unboundedly. Add cleanup()
that evicts entries older than 24h and enforces a 100-entry cap by
removing oldest entries first. Called on each Capture().

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
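The eviction policy above could be sketched as follows. The 24h age limit and 100-entry cap come from the commit message; the workspace type, its createdAt field, and the returned count are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// workspace is a hypothetical entry type; only its age matters here.
type workspace struct{ createdAt time.Time }

const (
	maxAge     = 24 * time.Hour
	maxEntries = 100
)

// cleanup drops entries older than maxAge, then removes the oldest remaining
// entries until at most maxEntries are left. It returns the number evicted.
func cleanup(ws map[string]workspace, now time.Time) int {
	evicted := 0
	for k, w := range ws {
		if now.Sub(w.createdAt) > maxAge {
			delete(ws, k)
			evicted++
		}
	}
	if len(ws) <= maxEntries {
		return evicted
	}
	keys := make([]string, 0, len(ws))
	for k := range ws {
		keys = append(keys, k)
	}
	// Oldest first, so the front of the slice is evicted.
	sort.Slice(keys, func(i, j int) bool {
		return ws[keys[i]].createdAt.Before(ws[keys[j]].createdAt)
	})
	for _, k := range keys[:len(ws)-maxEntries] {
		delete(ws, k)
		evicted++
	}
	return evicted
}

// demo builds 30 expired and 120 fresh entries and runs one cleanup pass.
func demo() (evicted, remaining int) {
	now := time.Now()
	ws := make(map[string]workspace)
	for i := 0; i < 30; i++ {
		ws[fmt.Sprintf("old-%d", i)] = workspace{createdAt: now.Add(-48 * time.Hour)}
	}
	for i := 0; i < 120; i++ {
		ws[fmt.Sprintf("new-%d", i)] = workspace{createdAt: now.Add(-time.Duration(i) * time.Minute)}
	}
	evicted = cleanup(ws, now)
	return evicted, len(ws)
}

func main() {
	evicted, remaining := demo()
	fmt.Println(evicted, remaining)
}
```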
SanitizeEnv() removed only control characters, not shell
metacharacters. A malicious repo name could execute arbitrary commands
via environment variable injection (e.g. backticks, $(), semicolons).

Add stripShellMeta() to strip backticks, dollar signs, semicolons,
pipes, ampersands, and other shell-significant characters from values
passed to the bash seed script environment.

Fixes #59

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
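A rough sketch of what a helper like stripShellMeta() might look like. The commit names backticks, dollar signs, semicolons, pipes, and ampersands; the rest of the character set below is an assumption, and the real implementation may strip a different set.

```go
package main

import (
	"fmt"
	"strings"
)

// shellMeta lists the characters this sketch treats as shell-significant.
// Everything beyond backtick, $, ;, | and & is an assumption.
const shellMeta = "`$;|&<>(){}[]!*?\\\"'\n"

// stripShellMeta removes every rune found in shellMeta from s.
func stripShellMeta(s string) string {
	return strings.Map(func(r rune) rune {
		if strings.ContainsRune(shellMeta, r) {
			return -1 // a negative value tells strings.Map to drop the rune
		}
		return r
	}, s)
}

func main() {
	fmt.Println(stripShellMeta("repo`whoami`;$(id)"))
	fmt.Println(stripShellMeta("my-repo_1.0"))
}
```

Stripping is a deny-list approach; validating values against an allow-list of safe characters is a stricter alternative.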
Add fetcher_test.go covering: service creation, start/pause lifecycle,
calculatePriority scoring for all label types, label query construction
with custom and default labels, gh CLI JSON parsing for both list and
single-issue endpoints, channel backpressure when issuesCh is full,
fetchAll with no repos configured, and missing binary error handling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds a startup check that verifies gh is in PATH and authenticated
before initializing services. Provides clear install/auth instructions
on failure instead of cryptic exec errors at runtime.

Closes #61

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Capture and log the error from `git fetch origin` in createBranch()
instead of silently ignoring it. Warns the user they may be proceeding
with stale data if the fetch fails.

Fixes #62

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add sync.Mutex to SeederService to protect shared state during
concurrent SeedIssue, GetWorkspaceDir, and CleanupWorkspace calls.
Extract getWorkspaceDir as lock-free helper to avoid double-locking.

Closes #63

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Extract buildForkURL helper for testable fork URL construction and add
19 tests covering Submit validation, HTTPS/SSH fork URLs, PR body
generation, and ensureFork error handling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a live agent activity dashboard to the Core App Laravel frontend.
Provides real-time visibility into agent fleet status, job queue,
activity feed, metrics, and human-in-the-loop actions — replacing
SSH + tail -f as the operator interface.

Dashboard panels:
- Agent Fleet: grid of agent cards with heartbeat, status, model info
- Job Queue: filterable table with cancel/retry actions
- Live Activity Feed: real-time stream with agent/type filters
- Metrics: stat cards, budget gauge, cost breakdown, throughput chart
- Human Actions: inline question answering, review gate approval

Tech: Laravel Blade + Livewire 4 + Tailwind CSS + Alpine.js + ApexCharts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements the security wall between non-aligned agents (issue #97).

Adds pkg/trust with:
- Three trust tiers: Full (Tier 3), Verified (Tier 2), Untrusted (Tier 1)
- Agent registry with mutex-protected concurrent access
- Policy engine with capability-based access control
- Repo-scoped permissions for Tier 2 agents
- Default policies matching the spec (rate limits, approval gates, denials)
- 49 tests covering all tiers, capabilities, edge cases, and helpers

Closes #97

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
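The tier model above might be sketched like this. The three tier names and the repo scoping for Tier 2 come from the commit message; the capability names (read, push, merge), the minTier table, and all type and function names are hypothetical and do not reflect the actual pkg/trust API.

```go
package main

import (
	"fmt"
	"sync"
)

type Tier int

const (
	Untrusted Tier = 1
	Verified  Tier = 2
	Full      Tier = 3
)

type Agent struct {
	Name  string
	Tier  Tier
	Repos []string // repos a Verified agent may act on
}

// Registry stores agents behind a mutex for concurrent access.
type Registry struct {
	mu     sync.RWMutex
	agents map[string]Agent
}

func NewRegistry() *Registry { return &Registry{agents: make(map[string]Agent)} }

func (r *Registry) Register(a Agent) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.agents[a.Name] = a
}

func (r *Registry) Get(name string) (Agent, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	a, ok := r.agents[name]
	return a, ok
}

// minTier maps hypothetical capability names to the lowest tier allowed.
var minTier = map[string]Tier{
	"read":  Untrusted,
	"push":  Verified,
	"merge": Full,
}

// Allowed reports whether agent a may use capability on repo. Unknown
// capabilities are denied. Verified agents are limited to their repo scope
// for any capability above the Untrusted baseline.
func Allowed(a Agent, capability, repo string) bool {
	need, known := minTier[capability]
	if !known || a.Tier < need {
		return false
	}
	if a.Tier == Verified && need >= Verified {
		for _, scoped := range a.Repos {
			if scoped == repo {
				return true
			}
		}
		return false
	}
	return true
}

func main() {
	reg := NewRegistry()
	reg.Register(Agent{Name: "builder", Tier: Verified, Repos: []string{"core"}})
	a, _ := reg.Get("builder")
	fmt.Println(Allowed(a, "push", "core"), Allowed(a, "push", "other"))
}
```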
Add ForgejoClient and ForgejoService to the Laravel app, providing a
clean service layer for all Forgejo REST API operations the orchestrator
needs. Supports multiple instances (forge, dev, qa) with config-driven
auto-routing, token auth, retry with circuit breaker, and pagination.

Covers issues, PRs, repos, branches, user/token management, and orgs.

Closes #98

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements quota enforcement for agents including daily token limits,
daily job limits, concurrent job caps, model allowlists, and global
per-model budgets. Quota recovery returns 50% for failed jobs and
100% for cancelled jobs.

Go: AllowanceService with MemoryStore, AllowanceStore interface, and
25 tests covering all enforcement paths.

Laravel: migration for 5 tables (agent_allowances, quota_usage,
model_quotas, usage_reports, repo_limits), Eloquent models,
AllowanceService, QuotaMiddleware, and REST API routes.

Closes #99

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
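A small sketch of the quota recovery rule, assuming the percentages apply to the tokens charged for the job. The outcome type and the refund function are hypothetical names, not part of AllowanceService.

```go
package main

import "fmt"

type jobOutcome int

const (
	completed jobOutcome = iota
	failed
	cancelled
)

// refund returns how many of the charged tokens go back to the agent's quota:
// 50% for failed jobs, 100% for cancelled jobs, nothing otherwise.
func refund(charged int64, outcome jobOutcome) int64 {
	switch outcome {
	case failed:
		return charged / 2
	case cancelled:
		return charged
	default:
		return 0
	}
}

func main() {
	fmt.Println(refund(1000, failed), refund(1000, cancelled), refund(1000, completed))
}
```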
Pass the API key via x-goog-api-key HTTP header instead of the URL
query parameter to prevent credential leakage in proxy logs, web
server access logs, and monitoring systems.

Resolves: #47 (CVSS 5.3, OWASP A09:2021)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
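For illustration, a request built this way keeps the key out of the URL. The header name x-goog-api-key is from the commit message; the endpoint and the function name are placeholders.

```go
package main

import (
	"fmt"
	"net/http"
)

// newRequest attaches the API key as a header so it never appears in the
// URL, and therefore not in access logs that record request lines.
func newRequest(endpoint, apiKey string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, endpoint, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("x-goog-api-key", apiKey)
	return req, nil
}

func main() {
	// example.invalid is a placeholder host, not a real endpoint.
	req, err := newRequest("https://example.invalid/v1/generate", "secret")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.URL.String())
}
```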
Move shared state initialization (issues, seen) and the load() call
inside the mutex scope in NewQueueService() to eliminate the race
window where concurrent callers could observe partially initialized
state. Remove the redundant heap.Init before the lock since load()
already calls heap.Init when restoring from disk.

Add documentation to save() and load() noting they must be called
with q.mu held.

Fixes #51

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Prevent path traversal in Journal.Append() by validating RepoOwner and
RepoName before using them in file paths. Malicious values like
"../../etc/cron.d" could previously write outside the journal baseDir.

Defence layers:
- Reject inputs containing path separators (/ or \)
- Reject ".." and "." traversal components
- Validate against safe character regex ^[a-zA-Z0-9][a-zA-Z0-9._-]*$
- Verify resolved absolute path stays within baseDir

Closes #46
CVSS 6.3 — OWASP A01:2021-Broken Access Control

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
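The four defence layers listed above can be sketched as below. The regular expression is the one quoted in the commit message; validateComponent and journalDir are hypothetical names and do not reflect the actual Journal.Append() code.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"regexp"
	"strings"
)

var safeName = regexp.MustCompile(`^[a-zA-Z0-9][a-zA-Z0-9._-]*$`)

// validateComponent applies the first three layers to a single path element.
func validateComponent(name string) error {
	if strings.ContainsAny(name, `/\`) {
		return errors.New("path separator not allowed")
	}
	if name == "." || name == ".." {
		return errors.New("traversal component not allowed")
	}
	if !safeName.MatchString(name) {
		return errors.New("invalid characters")
	}
	return nil
}

// journalDir validates owner and repo, then confirms the resolved absolute
// path is still inside baseDir (the fourth layer).
func journalDir(baseDir, owner, repo string) (string, error) {
	for _, c := range []string{owner, repo} {
		if err := validateComponent(c); err != nil {
			return "", fmt.Errorf("%q: %w", c, err)
		}
	}
	base, err := filepath.Abs(baseDir)
	if err != nil {
		return "", err
	}
	full, err := filepath.Abs(filepath.Join(base, owner, repo))
	if err != nil {
		return "", err
	}
	rel, err := filepath.Rel(base, full)
	if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", errors.New("path escapes base directory")
	}
	return full, nil
}

func main() {
	fmt.Println(journalDir("/tmp/journal", "host-uk", "core"))
	fmt.Println(journalDir("/tmp/journal", "../../etc", "cron.d"))
}
```

The final containment check is redundant when the earlier layers hold, which is the point of layering: it still catches a mistake in any one of them.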
The workspace map previously only cleaned up during Capture() calls,
meaning stale entries would accumulate indefinitely if no new captures
occurred. This adds:

- Background sweeper goroutine (Start/Stop lifecycle) that runs every 5
  minutes to evict expired workspaces
- Configurable MaxWorkspaces and WorkspaceTTLMinutes in Config (defaults:
  100 entries, 24h TTL) replacing hardcoded constants
- cleanup() now returns eviction count for observability logging
- Nil-config fallback to safe defaults

Fixes #54

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
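A minimal sketch of a ticker-driven sweeper with a Start/Stop lifecycle, as described above. The Sweeper type and its fields are assumptions; the real service runs every 5 minutes, while the demo uses a millisecond interval so it finishes quickly.

```go
package main

import (
	"fmt"
	"time"
)

// Sweeper calls sweep on a fixed interval until stopped.
type Sweeper struct {
	interval time.Duration
	sweep    func()
	stop     chan struct{}
	done     chan struct{}
}

func NewSweeper(interval time.Duration, sweep func()) *Sweeper {
	return &Sweeper{
		interval: interval,
		sweep:    sweep,
		stop:     make(chan struct{}),
		done:     make(chan struct{}),
	}
}

// Start launches the background goroutine.
func (s *Sweeper) Start() {
	go func() {
		defer close(s.done)
		ticker := time.NewTicker(s.interval)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				s.sweep()
			case <-s.stop:
				return
			}
		}
	}()
}

// Stop signals the goroutine and waits for it to exit. Call it once, and
// only after Start.
func (s *Sweeper) Stop() {
	close(s.stop)
	<-s.done
}

// demo reports whether at least one sweep ran before the timeout.
func demo() bool {
	swept := make(chan struct{}, 1)
	s := NewSweeper(time.Millisecond, func() {
		select {
		case swept <- struct{}{}:
		default: // a sweep was already recorded
		}
	})
	s.Start()
	defer s.Stop()
	select {
	case <-swept:
		return true
	case <-time.After(2 * time.Second):
		return false
	}
}

func main() {
	fmt.Println(demo())
}
```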
Update Forgejo domain references in CI pipeline, vanity import
tool, and core-app codex prompt.

Co-Authored-By: Virgil <virgil@lethean.io>
Replace all exec.Command("gh", ...) calls with the existing pkg/forge
wrapper around the Forgejo Go SDK. BugSETI no longer requires the gh
CLI to be installed.

Changes:
- fetcher: use forge.ListIssues/GetIssue instead of gh issue list/view
- submit: use forge.ForkRepo/CreatePullRequest instead of gh pr create
- seeder: use git clone with forge URL + token auth instead of gh clone
- ghcheck: CheckForge() returns *forge.Client via forge.NewFromConfig()
- config: add ForgeURL/ForgeToken fields (GitHubToken kept for migration)
- pkg/forge: add Token(), GetCurrentUser(), ForkRepo(), CreatePullRequest(),
  ListIssueComments(), and label filtering to ListIssuesOpts

Co-Authored-By: Virgil <virgil@lethean.io>
Thin HTTP client for portal coordination API — issue claiming,
stats sync, leaderboard, auto-register via forge token.

Co-Authored-By: Virgil <virgil@lethean.io>
10 tasks covering Go client + Laravel auth endpoint.
TDD approach with httptest mocks.

Co-Authored-By: Virgil <virgil@lethean.io>
Add HubURL, HubToken, ClientID, and ClientName fields to Config struct
for agentic portal integration. Include getter/setter methods following
the existing pattern (SetForgeURL, SetForgeToken also added).

Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduce HubService struct with types for hub coordination: PendingOp,
HubClaim, LeaderboardEntry, GlobalStats, ConflictError, NotFoundError.
Constructor generates a crypto/rand client ID when none exists. Includes
no-op loadPendingOps/savePendingOps stubs for future persistence.

Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
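Generating a client ID from crypto/rand could look like the sketch below. The 16-byte length and hex encoding are assumptions; the commit only states that crypto/rand is used.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newClientID returns a random identifier drawn from the OS entropy source.
// The length and encoding are illustrative choices.
func newClientID() (string, error) {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

func main() {
	id, err := newClientID()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(id))
}
```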
Add doRequest() and doJSON() methods for hub API communication. doRequest
builds full URLs, sets bearer auth and JSON headers, tracks connected
state. doJSON handles status codes: 401 unauthorised, 409 ConflictError,
404 NotFoundError, and generic errors for other 4xx/5xx responses.

Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
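The status handling in doJSON might be sketched as a mapping from HTTP status to error. ConflictError and NotFoundError are named in the commits, but their fields, the errUnauthorised sentinel, and statusToError are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// ConflictError and NotFoundError mirror the type names in the commit; the
// Status field is hypothetical.
type ConflictError struct{ Status int }

func (e *ConflictError) Error() string { return fmt.Sprintf("conflict (HTTP %d)", e.Status) }

type NotFoundError struct{ Status int }

func (e *NotFoundError) Error() string { return fmt.Sprintf("not found (HTTP %d)", e.Status) }

var errUnauthorised = errors.New("unauthorised")

// statusToError maps a response status to nil or a typed error.
func statusToError(status int) error {
	switch {
	case status == http.StatusUnauthorized:
		return errUnauthorised
	case status == http.StatusConflict:
		return &ConflictError{Status: status}
	case status == http.StatusNotFound:
		return &NotFoundError{Status: status}
	case status >= 400:
		return fmt.Errorf("hub request failed: HTTP %d", status)
	default:
		return nil
	}
}

func main() {
	for _, s := range []int{200, 401, 404, 409, 500} {
		fmt.Println(s, statusToError(s))
	}
}
```

Typed errors let callers branch with errors.As, for example to treat a 404 as "unclaimed" rather than as a failure.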
Exchange a Forge API token for a hub API key by POSTing to
/api/bugseti/auth/forge. Skips if hub token already cached.
Adds drainPendingOps() stub for future Task 7 use.

Co-Authored-By: Virgil <virgil@lethean.io>
Add Register, Heartbeat, ClaimIssue, UpdateStatus, ReleaseClaim,
and SyncStats methods for hub coordination. ClaimIssue returns
ConflictError on 409 and calls drainPendingOps before mutating.

Co-Authored-By: Virgil <virgil@lethean.io>
Add IsIssueClaimed, ListClaims, GetLeaderboard, and GetGlobalStats
methods. IsIssueClaimed returns (nil, nil) on 404 for unclaimed
issues. GetLeaderboard returns entries and total participant count.

Co-Authored-By: Virgil <virgil@lethean.io>
Replace no-op stubs with real implementations for queueOp, drainPendingOps,
savePendingOps, and loadPendingOps. Operations are persisted to hub_pending.json
and replayed on next hub connection — 5xx/transport errors are retried, 4xx
responses are dropped as stale. Adds PendingCount() for queue inspection.

Co-Authored-By: Virgil <virgil@lethean.io>
Add HubService to the Wails service list and attempt hub registration
at startup when hubUrl is configured. Drains any pending operations
queued from previous sessions.

Co-Authored-By: Virgil <virgil@lethean.io>
main.go was removed when Wails3 apps were added to cmd/, breaking
`go build .` for the core CLI. Restore it and update variants/full.go
to include daemon, forge, mcpcmd, prod, and session commands. Drop gitea
(superseded by forge) and unifi (unused).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Reviewed-on: https://forge.lthn.io/host-uk/core/pulls/161
Reviewed-by: Snider <snider@lethean.io>
Move module identity to our own Forgejo instance. All import paths
updated across 434 Go files, sub-module go.mod files, and go.work.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Port LEM scoring/training pipeline into CoreGo as pkg/ml with:
- Inference abstraction with HTTP, llama-server, and Ollama backends
- 3-tier scoring engine (heuristic, exact, LLM judge)
- Capability and content probes for model evaluation
- GGUF/safetensors format converters, MLX to PEFT adapter conversion
- DuckDB integration for training data pipeline
- InfluxDB metrics for lab dashboard
- Training data export (JSONL + Parquet)
- Expansion generation pipeline with distributed workers
- 10 CLI commands under 'core ml' (score, probe, export, expand, status, gguf, convert, agent, worker)
- 5 MCP tools (ml_generate, ml_score, ml_probe, ml_status, ml_backends)

All 37 ML tests passing. Binary builds at 138MB with all commands.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CGo wrapper for mlx-c providing zero-Python Metal GPU inference.
Includes Gemma 3 model architecture, BPE tokenizer, KV cache,
composable sampling, and OpenAI-compatible serve command.

Build-tagged (darwin && arm64 && mlx) with stubs for cross-platform builds.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Use _axis/_axes variants for softmax, argmax, topk, sum, mean, squeeze,
  concatenate, argpartition
- Fix size_t vs int for count parameters throughout
- Fix int64_t strides in as_strided
- Add mlx_optional_int + mode param to quantized_matmul
- Use mlx_array_new() for null arrays (freqs, key, mask, sinks)
- Fix expand_dims to single-axis signature
- Fix compile callback signature (size_t index)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use pure C callback instead of //export to avoid const char* vs
GoString type mismatch in cgo-generated headers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Supports both multimodal (Gemma3ForConditionalGeneration) and
text-only configs. Resolves weights with language_model. prefix
fallback. Computes head_dim from hidden_size when missing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Build-tagged backend selection: MLX on darwin/arm64/mlx, HTTP elsewhere.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Gemma 3's tokenizer.json stores merges as [["a","b"],...] pairs
rather than ["a b",...] strings. Support both.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
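A sketch of accepting both merges layouts. It tries the pair form first and falls back to the space-separated string form; parseMerges is a hypothetical name and the real tokenizer loader may be structured differently.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// parseMerges accepts either [["a","b"],...] or ["a b",...] and returns the
// merges as ordered pairs.
func parseMerges(raw []byte) ([][2]string, error) {
	var pairs [][]string
	if err := json.Unmarshal(raw, &pairs); err == nil {
		out := make([][2]string, 0, len(pairs))
		for _, p := range pairs {
			if len(p) != 2 {
				return nil, fmt.Errorf("merge pair has %d elements, want 2", len(p))
			}
			out = append(out, [2]string{p[0], p[1]})
		}
		return out, nil
	}

	// Fall back to the string form, split on the first space.
	var lines []string
	if err := json.Unmarshal(raw, &lines); err != nil {
		return nil, fmt.Errorf("unrecognised merges format: %w", err)
	}
	out := make([][2]string, 0, len(lines))
	for _, line := range lines {
		a, b, ok := strings.Cut(line, " ")
		if !ok {
			return nil, fmt.Errorf("malformed merge %q", line)
		}
		out = append(out, [2]string{a, b})
	}
	return out, nil
}

func main() {
	fmt.Println(parseMerges([]byte(`[["a","b"],["ab","c"]]`)))
	fmt.Println(parseMerges([]byte(`["a b","ab c"]`)))
}
```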
- Add QuantizedLinear with QuantizedMatmul for packed uint32 weights
- Add quantized Embedding with Dequantize before lookup
- Parse quantization config (group_size, bits) from config.json
- Detect .scales/.biases weight tensors and auto-select quantized path
- Add Dequantize op wrapping mlx_dequantize
- Add safety guard to KVCache.Update for malformed shapes
- Handle tied embeddings with quantization (AsLinear helper)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add ClearCache() wrapping mlx_clear_cache
- Clear Metal allocator cache every 8 tokens during generation
- Set 16GB cache limit on backend init
- Prevents GPU memory from growing unbounded during inference

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Go GC cannot see Metal/C memory pressure, so intermediate arrays from
each forward pass accumulated without bound, causing OOM kills after
3-4 requests. Fix: runtime.SetFinalizer on every Array releases C
handles when GC collects them, and runtime.GC() is forced every 4
tokens during generation. Also adds SetMemoryLimit(24GB) as a hard
Metal ceiling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Go wrapper was tracking inter-array references via desc.inputs,
creating chains that kept all intermediate arrays alive across requests.
After 3-4 requests, Metal memory grew to 170GB+ and macOS killed the
process.

Fix: remove desc.inputs/numRefs entirely. MLX-C has its own internal
reference counting — when Go GC finalizes an Array wrapper, it calls
mlx_array_free which decrements the C-side refcount. If the C-side
count reaches 0, Metal memory is freed. Go GC + MLX-C refcounting
together handle all lifecycle management correctly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tracks model size at load time and checks Metal active memory after
each generation. If usage exceeds 3× model size, forces double GC
and cache clear as a safety net.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
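The safety net above could be sketched like this. The 3x threshold and the double GC followed by a cache clear are from the commit message; checkMemory and its parameters are hypothetical, and clearCache stands in for the wrapper around mlx_clear_cache.

```go
package main

import (
	"fmt"
	"runtime"
)

// checkMemory compares active Metal memory with the model size recorded at
// load time. Above 3x model size it forces two GC cycles and clears the
// allocator cache, and reports that it did so.
func checkMemory(activeBytes, modelBytes uint64, clearCache func()) bool {
	if modelBytes == 0 || activeBytes <= 3*modelBytes {
		return false
	}
	// Two cycles, as in the commit: finalizers queued by the first cycle
	// release their handles, and the second can collect what they freed.
	runtime.GC()
	runtime.GC()
	clearCache()
	return true
}

func main() {
	cleared := false
	clear := func() { cleared = true }
	fmt.Println(checkMemory(10<<30, 4<<30, clear), cleared)
	fmt.Println(checkMemory(13<<30, 4<<30, clear), cleared)
}
```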
Ports all remaining LEM pipeline commands from pkg/lem into core ml,
eliminating the standalone LEM CLI dependency. Each command is split
into reusable business logic (pkg/ml/) and a thin cobra wrapper
(internal/cmd/ml/).

New commands: query, inventory, metrics, ingest, normalize, seed-influx,
consolidate, import-all, approve, publish, coverage.

Adds Path(), Exec(), QueryRowScan() convenience methods to DB type.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
feat: integrate lab dashboard as core lab serve
da81534897
Port the standalone lab dashboard (lab.lthn.io) into the core CLI as
pkg/lab/ with collectors, handlers, and HTML templates. The dashboard
monitors machines, Docker containers, Forgejo, HuggingFace models,
training runs, and InfluxDB metrics with SSE live updates.

New command: core lab serve --bind :8080

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Five-phase ethics kernel testing across 4 local models (Gemma 3 12B,
Mistral 7B, DeepSeek V2 16B, Qwen 2.5 7B) proving that Google's
alignment training creates persistent ethical reasoning pathways in
Gemma that survive distillation.

- Phase 1: LEK-1 signed vs unsigned (Gemma 8.8/10 differential)
- Phase 2: Three-way test (unsigned vs LEK-1 vs Axioms of Life)
- Phase 3: Double-signed/sandwich signing mode comparison
- Phase 4: Multilingual filter mapping (EN/RU/CN bypass vectors)
- Phase 5: Hypnos POC training data + MLX LoRA on M3 Ultra

Key findings: sandwich signing optimal for training, DeepSeek CCP
alignment is weight-level (no prompt override), Russian language
bypasses DeepSeek content filters. LoRA POC mechanism confirmed
with 40 examples — needs 200+ for stable generalisation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Merge branch 'feat/ml-integration' into dev
48d385279b
# Conflicts:
#	.gh-actions/ISSUE_TEMPLATE/config.yml
#	.gh-actions/workflows/alpha-release-manual.yml
#	.gh-actions/workflows/alpha-release-push.yml
#	.gh-actions/workflows/alpha-release.yml
#	.gh-actions/workflows/bugseti-release.yml
#	.gh-actions/workflows/ci-manual.yml
#	.gh-actions/workflows/ci-pull-request.yml
#	.gh-actions/workflows/ci-push.yml
#	.gh-actions/workflows/ci.yml
#	.gh-actions/workflows/coverage-manual.yml
#	.gh-actions/workflows/coverage-pull-request.yml
#	.gh-actions/workflows/coverage-push.yml
#	.gh-actions/workflows/coverage.yml
#	.gh-actions/workflows/release.yml
#	cmd/bugseti/go.mod
#	cmd/bugseti/workspace.go
#	go.sum
#	internal/bugseti/submit.go
#	internal/bugseti/updater/go.mod
#	internal/cmd/ml/cmd_ml.go
#	internal/core-ide/go.mod
#	internal/variants/full.go
#	pkg/ml/db.go
Virgil added 1 commit 2026-02-16 06:18:22 +00:00
Merge branch 'feat/ml-integration' into HEAD
Some checks failed
Security Scan / Go Vulnerability Check (push) Has been cancelled
Security Scan / Secret Detection (push) Has been cancelled
Security Scan / Dependency & Config Scan (push) Has been cancelled
9960d231d0
# Conflicts:
#	cmd/bugseti/go.mod
#	internal/bugseti/submit.go
#	internal/core-ide/go.mod
Snider merged commit 4eb1e02f5e into dev 2026-02-16 06:19:10 +00:00
Snider referenced this pull request from a commit 2026-02-16 06:19:11 +00:00