LEM/scripts/chat-4b.py
Snider 74ef174ec8 feat: add faithful 12B training scripts (P0-P6) — 1:1 port of 4B curriculum
Exact reproduction of all 7 CL-BPL phases for Gemma3-12B:
- P0: LEK sandwich ethics (400 iters, LR 2e-5)
- P1: Zen composure (300 iters, LR 1e-5)
- P2: LEK sandwich reinforcement (300 iters, LR 1e-5)
- P3: Freeflow multi-source (300 iters, LR 1e-5)
- P4: 1B teacher tension distillation (300 iters, LR 1e-5)
- P5: 1B teacher creative distillation (300 iters, LR 1e-5)
- P6: Golden set graduation (13479 iters, LR 1e-5)
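The seven-phase schedule above can be summarized as a small table; the sketch below is illustrative (the tuple layout and names are hypothetical), with all iteration counts and learning rates taken directly from the listing:

```python
# (phase, description, iterations, learning_rate) -- values from the commit message.
PHASES = [
    ('P0', 'LEK sandwich ethics',              400,   2e-5),
    ('P1', 'Zen composure',                    300,   1e-5),
    ('P2', 'LEK sandwich reinforcement',       300,   1e-5),
    ('P3', 'Freeflow multi-source',            300,   1e-5),
    ('P4', '1B teacher tension distillation',  300,   1e-5),
    ('P5', '1B teacher creative distillation', 300,   1e-5),
    ('P6', 'Golden set graduation',            13479, 1e-5),
]

# Total training iterations implied by the curriculum.
total_iters = sum(iters for _, _, iters, _ in PHASES)
print(total_iters)  # 15379
```

Note that P6 dominates the budget: the golden-set graduation phase alone accounts for 13,479 of the 15,379 total iterations.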

Only model-size differences from 4B: 48GB/12GB Metal limits,
24 LoRA layers (vs 16), 12B base model path.
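Per the message, only three constants separate the 12B scripts from the 4B ones. A minimal sketch of those diffs (the constant names are hypothetical; only the values come from the commit message):

```python
GiB = 1024 ** 3

# 12B-specific knobs; the 4B scripts use 24 GiB / 8 GiB / 16 layers.
METAL_MEMORY_LIMIT = 48 * GiB   # Metal wired-memory ceiling
METAL_CACHE_LIMIT = 12 * GiB    # Metal buffer-cache ceiling
NUM_LORA_LAYERS = 24            # LoRA adapters on the top 24 transformer layers
```

Everything else (data, iteration counts, learning rates, scoring cadence) is identical between the two model sizes.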

All phases score at checkpoint cadence via lem-scorer.
Previous wrong 12B models preserved as -no-axioms control group.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 20:44:03 +00:00


#!/usr/bin/env python3
"""Interactive chat with LEM-Gemma3-4B (graduated)."""
import sys
sys.stdout.reconfigure(line_buffering=True)
import mlx.core as mx
from mlx_lm import load, generate
from mlx_lm.sample_utils import make_sampler
mx.metal.set_memory_limit(24 * 1024**3)
mx.metal.set_cache_limit(8 * 1024**3)
MODEL_PATH = '/Volumes/Data/lem/models/LEM-Gemma3-4B'
print('Loading LEM-Gemma3-4B...')
model, tokenizer = load(MODEL_PATH)
model.eval()  # switch to inference mode
print('Ready.\n')
sampler = make_sampler(temp=0.7)
history = []
while True:
    try:
        user_input = input('You: ').strip()
    except (EOFError, KeyboardInterrupt):
        print('\nBye.')
        break
    if not user_input:
        continue
    if user_input.lower() == '/clear':
        history = []
        print('History cleared.\n')
        continue
    history.append({'role': 'user', 'content': user_input})
    prompt_text = tokenizer.apply_chat_template(
        history,
        tokenize=False,
        add_generation_prompt=True,
    )
    response = generate(model, tokenizer, prompt=prompt_text, max_tokens=512, sampler=sampler)
    history.append({'role': 'assistant', 'content': response})
    print(f'\nLEM: {response}\n')
    mx.clear_cache()  # release Metal buffers between turns