LEM/scripts at da3db34e45ae1fb6a12aeb6b162953eee11d6b74 - lthn/LEM

Charon abd63d3342 Add standard benchmark suite using EleutherAI lm-evaluation-harness		2026-02-15 00:05:48 +00:00
- run_benchmarks.sh: wrapper for lm-eval with suite presets (quick, classic, leaderboard-v2, full)
- compare_models.py: compare base vs LEK results with delta table
- Supports HF transformers, local-chat-completions (MLX/Ollama), and vLLM backends
- Results comparable to HuggingFace Open LLM Leaderboard
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
compare_models.py	Add standard benchmark suite using EleutherAI lm-evaluation-harness	2026-02-15 00:05:48 +00:00
compare_v1_v2.py	Add regional seeds, expansion rounds, scripts, HF cards, benchmark summary	2026-02-13 13:39:08 +00:00
export_parquet.py	Add missing HF model cards, sync script, and Parquet export	2026-02-14 23:50:18 +00:00
lem_benchmark.py	LEM+LEK	2026-02-12 04:05:28 +00:00
lem_cross_arch_benchmark.py	Add cross-architecture training and benchmarking scripts; update README and PAPER with author and repository information	2026-02-12 09:07:32 +00:00
lem_cross_arch_train.py	Add cross-architecture training and benchmarking scripts; update README and PAPER with author and repository information	2026-02-12 09:07:32 +00:00
lem_gemini3_generate.py	Add regional seeds, expansion rounds, scripts, HF cards, benchmark summary	2026-02-13 13:39:08 +00:00
lem_gemini3flash_generate.py	Add regional seeds, expansion rounds, scripts, HF cards, benchmark summary	2026-02-13 13:39:08 +00:00
lem_gemini25flash_generate.py	Add regional seeds, expansion rounds, scripts, HF cards, benchmark summary	2026-02-13 13:39:08 +00:00
lem_scale_benchmark.py	Benchmark & Findings:	2026-02-12 06:38:46 +00:00
lem_scale_scorer.py	Benchmark & Findings:	2026-02-12 06:38:46 +00:00
lem_scorer.py	LEM+LEK	2026-02-12 04:05:28 +00:00
lem_self_scorer.py	Add cross-architecture training and benchmarking scripts; update README and PAPER with author and repository information	2026-02-12 09:07:32 +00:00
lem_semantic_scorer.py	LEM+LEK	2026-02-12 04:05:28 +00:00
lem_standard_benchmark.py	LEM+LEK	2026-02-12 04:05:28 +00:00
lem_standard_scorer.py	LEM+LEK	2026-02-12 04:05:28 +00:00
push_all_models.py	Add regional seeds, expansion rounds, scripts, HF cards, benchmark summary	2026-02-13 13:39:08 +00:00
run_benchmarks.sh	Add standard benchmark suite using EleutherAI lm-evaluation-harness	2026-02-15 00:05:48 +00:00
sync_hf.py	Add missing HF model cards, sync script, and Parquet export	2026-02-14 23:50:18 +00:00
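
The latest commit describes compare_models.py as comparing base vs LEK results with a delta table. The script's contents are not shown in this listing, so the following is only a minimal sketch of that idea, not the repository's implementation. It assumes each model's results have already been reduced to a plain mapping of task name to a single score; the function names `delta_table` and `format_table`, the task names, and the numbers are hypothetical placeholders.

```python
def delta_table(base, tuned):
    """Build (task, base_score, tuned_score, delta) rows.

    Only tasks present in BOTH result mappings are compared, so a task
    that one run skipped does not produce a misleading delta.
    """
    rows = []
    for task in sorted(base.keys() & tuned.keys()):
        b, t = base[task], tuned[task]
        rows.append((task, b, t, t - b))
    return rows


def format_table(rows):
    """Render the rows as a fixed-width plain-text table."""
    lines = [f"{'task':<16}{'base':>8}{'lek':>8}{'delta':>8}"]
    for task, b, t, d in rows:
        # The '+' flag forces an explicit sign on the delta column.
        lines.append(f"{task:<16}{b:>8.3f}{t:>8.3f}{d:>+8.3f}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder scores for illustration only; these are not real results.
    base_scores = {"task_a": 0.5, "task_b": 0.25}
    lek_scores = {"task_a": 0.75, "task_b": 0.25, "task_c": 1.0}
    print(format_table(delta_table(base_scores, lek_scores)))
```

Restricting the comparison to the intersection of tasks is a design choice of this sketch: it keeps the table honest when the two runs used different suite presets.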