Commit graph

1 commit

Author SHA1 Message Date
Snider
c03c755104 feat(ml): add benchmark command for baseline vs trained model comparison
Runs the same prompts through baseline and fine-tuned models, scores
both with the heuristic scorer, and outputs a comparison report with
LEK score deltas and improvement/regression counts.
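
As a rough illustration of the comparison step described above, the sketch below computes per-prompt score deltas and counts improvements and regressions. It is not the code from this commit: the `Result` and `Summary` types, the `Compare` function, and the choice of `float64` for a score are all assumptions made for the example, and the heuristic scorer itself is treated as an input rather than implemented.

```go
package main

import "fmt"

// Result holds the heuristic scores for one prompt.
// Hypothetical type: the commit does not show its real data structures.
type Result struct {
	Prompt   string
	Baseline float64 // score of the baseline model's response
	Trained  float64 // score of the fine-tuned model's response
}

// Summary aggregates a baseline-vs-trained comparison (hypothetical type).
type Summary struct {
	MeanDelta float64 // average of (Trained - Baseline) over all prompts
	Improved  int     // prompts where the trained model scored higher
	Regressed int     // prompts where the trained model scored lower
	Unchanged int     // prompts where both scores were equal
}

// Compare computes score deltas and improvement/regression counts.
// An empty input yields a zero-valued Summary.
func Compare(results []Result) Summary {
	var s Summary
	if len(results) == 0 {
		return s
	}
	var total float64
	for _, r := range results {
		delta := r.Trained - r.Baseline
		total += delta
		switch {
		case delta > 0:
			s.Improved++
		case delta < 0:
			s.Regressed++
		default:
			s.Unchanged++
		}
	}
	s.MeanDelta = total / float64(len(results))
	return s
}

// String renders a one-line report with a signed mean delta.
func (s Summary) String() string {
	return fmt.Sprintf("mean delta %+.2f, improved %d, regressed %d, unchanged %d",
		s.MeanDelta, s.Improved, s.Regressed, s.Unchanged)
}
```

Calling `Compare` on a slice of scored prompts and printing the returned `Summary` gives a single-line report; a real report would presumably also list the per-prompt deltas.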

Uses built-in content probes by default, or a custom prompts file when one is supplied.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-17 17:55:10 +00:00