1
0
Fork 0
forked from lthn/LEM
Commit graph

4 commits

Author SHA1 Message Date
Snider
bd2f376a7a feat: add zen training set (Allen) to training/lem/zen/
10 examples across train/test/valid splits.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-22 00:02:47 +00:00
Snider
f65fd777ea feat: convert composure library to training JSONL format
Add cmd/composure-convert tool that chunks public domain philosophical
texts into training conversation pairs:
- consent.jsonl (198 examples) — Wollstonecraft's Vindication
- privacy.jsonl (221 examples) — Thoreau's Walden
- sovereignty.jsonl (56 examples) — Mill's On Liberty
- transparency.jsonl (159 examples) — Aurelius' Meditations

Each example pairs a domain-specific prompt with ~5 paragraphs from
the source text. Metadata, chapter headings, and Gutenberg boilerplate
are filtered out.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-21 23:59:06 +00:00
Snider
de18a0fb93 refactor: move composure-library to training/lem/composure/
Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-21 23:55:17 +00:00
Snider
d233e76648 feat: add training data to repo + make paths repo-relative
Move training/lem/ (probes, lessons, eval sets) into git so the
full curriculum is publicly releasable. Update .core/ai configs
and distill.go to use repo-relative paths instead of /Volumes/Data/.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-21 23:49:12 +00:00