{{template "head" "Dataset"}} {{template "nav" "dataset"}}
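{{/*
This page calls the helpers fmtInt, pct, tableRows and totalRows, which have to be registered on the template before parsing. Their real definitions are not shown in this file, so the following is only a minimal sketch under stated assumptions: the Table type, the one-decimal behaviour of pct, and the zero fallback in tableRows are all hypothetical.

```go
package main

import (
	"fmt"
	"html/template"
	"os"
	"strconv"
)

// Table is a hypothetical stand-in for one entry of .Dataset.Tables.
type Table struct {
	Name string
	Rows int
}

// fmtInt renders n with thousands separators, e.g. 87338 -> "87,338".
func fmtInt(n int) string {
	s := strconv.Itoa(n)
	neg := n < 0
	if neg {
		s = s[1:]
	}
	// Insert a comma before each group of three digits, right to left.
	for i := len(s) - 3; i > 0; i -= 3 {
		s = s[:i] + "," + s[i:]
	}
	if neg {
		s = "-" + s
	}
	return s
}

// pct formats a float with one decimal place (assumed behaviour);
// the template appends the unit ("%" or "s") itself.
func pct(v float64) string {
	return fmt.Sprintf("%.1f", v)
}

// tableRows returns the row count of the named table, or 0 if absent.
func tableRows(tables []Table, name string) int {
	for _, t := range tables {
		if t.Name == name {
			return t.Rows
		}
	}
	return 0
}

// totalRows sums the row counts of all tables.
func totalRows(tables []Table) int {
	sum := 0
	for _, t := range tables {
		sum += t.Rows
	}
	return sum
}

func main() {
	funcs := template.FuncMap{
		"fmtInt":    fmtInt,
		"pct":       pct,
		"tableRows": tableRows,
		"totalRows": totalRows,
	}
	const page = `Seeds: {{fmtInt (tableRows .Tables "seeds")}} of {{fmtInt (totalRows .Tables)}} rows ({{pct .Done}}%)` + "\n"
	// Funcs must be called before Parse so the names resolve at parse time.
	tmpl := template.Must(template.New("demo").Funcs(funcs).Parse(page))
	data := struct {
		Tables []Table
		Done   float64
	}{
		Tables: []Table{{Name: "seeds", Rows: 87338}, {Name: "expansion_prompts", Rows: 46331}},
		Done:   12.5, // sample value for illustration only
	}
	if err := tmpl.Execute(os.Stdout, data); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// prints: Seeds: 87,338 of 133,669 rows (12.5%)
}
```

Registering the helpers through a FuncMap keeps formatting decisions out of the data structs, so the same stats can be rendered differently by other pages.
*/}}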
{{/* -- Sidebar -- */}}
Overview
Golden Set {{if .GoldenSet.Available}}{{fmtInt .GoldenSet.TotalExamples}}{{end}}
Seeds {{if .Dataset.Available}}{{fmtInt (tableRows .Dataset.Tables "seeds")}}{{end}}
Domains
Voices
Expansion {{if .Dataset.Available}}{{fmtInt (tableRows .Dataset.Tables "expansion_prompts")}}{{end}}
Export
{{/* -- Main content -- */}}
{{if not .SelectedView}} {{/* -- Overview -- */}}

LEM Dataset

{{if .GoldenSet.Available}}

Golden Set

{{fmtInt .GoldenSet.TotalExamples}}
{{pct .GoldenSet.CompletionPct}}% of {{fmtInt .GoldenSet.TargetTotal}} target
{{end}} {{if .Dataset.Available}}

Seeds

{{fmtInt (tableRows .Dataset.Tables "seeds")}}
Source prompts for generation

Expansion Prompts

{{fmtInt (tableRows .Dataset.Tables "expansion_prompts")}}
Ready for model expansion

Training Examples

{{fmtInt (tableRows .Dataset.Tables "training_examples")}}
Chat-format JSONL splits
{{end}} {{if .GoldenSet.Available}}

Domains

{{.GoldenSet.Domains}}
Topic categories

Voices

{{.GoldenSet.Voices}}
Persona types

Avg Generation

{{pct .GoldenSet.AvgGenTime}}s
{{pct .GoldenSet.AvgResponseChars}} avg chars
{{end}}
{{if .Dataset.Available}}

DuckDB Tables

{{$total := totalRows .Dataset.Tables}}
Table | Rows | Size
{{range .Dataset.Tables}}{{.Name}} | {{fmtInt .Rows}}
{{end}}
{{end}} {{else if eq .SelectedView "golden"}} {{/* -- Golden Set detail -- */}}

Golden Set

{{if not .GoldenSet.Available}}

No golden set data available.

{{else}}

Total Examples

{{fmtInt .GoldenSet.TotalExamples}}
{{pct .GoldenSet.CompletionPct}}% of {{fmtInt .GoldenSet.TargetTotal}}

Domains

{{.GoldenSet.Domains}}
Unique topic domains

Voices

{{.GoldenSet.Voices}}
Persona voice types

Avg Generation

{{pct .GoldenSet.AvgGenTime}}s
{{pct .GoldenSet.AvgResponseChars}} avg chars
{{if .GoldenSet.Workers}}

Workers

Worker | Generations
{{range .GoldenSet.Workers}}{{.Worker}} | {{fmtInt .Count}}
{{end}}
{{end}} {{end}} {{else if eq .SelectedView "seeds"}} {{/* -- Seeds -- */}}

Seeds

{{if .Dataset.Available}}

Total Seeds

{{fmtInt (tableRows .Dataset.Tables "seeds")}}
Source prompts in DuckDB

Prompts Generated

{{fmtInt (tableRows .Dataset.Tables "prompts")}}
Processed from seeds
{{else}}

Seeds

87,338
Push stats via dataset_stats
{{end}}

Seed browser coming soon. Use lem export --seeds to explore locally.

{{else if eq .SelectedView "domains"}} {{/* -- Domains -- */}}

Domains

{{if and .GoldenSet.Available .GoldenSet.DomainStats}}

Total Domains

{{.GoldenSet.Domains}}
Unique topic categories

Total Examples

{{fmtInt .GoldenSet.TotalExamples}}
Across all domains

Distribution (top 25)

{{domainChart .GoldenSet.DomainStats}}

All Domains

Domain | Count | Avg Gen Time | Coverage
{{range .GoldenSet.DomainStats}}{{.Domain}} | {{.Count}} | {{pct .AvgGenTime}}s
{{end}}
{{else}}

No domain data available.

{{end}} {{else if eq .SelectedView "voices"}} {{/* -- Voices -- */}}

Voices

{{if and .GoldenSet.Available .GoldenSet.VoiceStats}}

Total Voices

{{.GoldenSet.Voices}}
Persona types

Total Examples

{{fmtInt .GoldenSet.TotalExamples}}
Across all voices

Distribution

{{voiceChart .GoldenSet.VoiceStats}}

Voice Details

Voice | Count | Avg Chars | Avg Gen Time
{{range .GoldenSet.VoiceStats}}{{.Voice}} | {{.Count}} | {{pct .AvgChars}} | {{pct .AvgGenTime}}s
{{end}}
{{else}}

No voice data available.

{{end}} {{else if eq .SelectedView "expansion"}} {{/* -- Expansion -- */}}

Expansion

{{if .Dataset.Available}}

Expansion Prompts

{{fmtInt (tableRows .Dataset.Tables "expansion_prompts")}}
Deduped, ready for generation

Gemini Responses

{{fmtInt (tableRows .Dataset.Tables "gemini_responses")}}
Reference responses for scoring

Benchmark Questions

{{fmtInt (tableRows .Dataset.Tables "benchmark_questions")}}
Capability test set

Benchmark Results

{{fmtInt (tableRows .Dataset.Tables "benchmark_results")}}
Scored responses
{{else}}

Expansion Prompts

46,331
Push stats via dataset_stats
{{end}}

Expansion pipeline: use lem expand to generate responses from trained models, then lem score to filter by quality.

{{else if eq .SelectedView "export"}} {{/* -- Export -- */}}

Export

{{if .Dataset.Available}}

Training Examples

{{fmtInt (tableRows .Dataset.Tables "training_examples")}}
Chat-format JSONL

Validations

{{fmtInt (tableRows .Dataset.Tables "validations")}}
Quality checks
{{end}}

Export formats:

Format | Command | Use
JSONL (MLX) | lem export --format jsonl | MLX LoRA training (train/valid/test splits)
Parquet | lem export --format parquet | HuggingFace dataset upload
CSV | lem export --format csv | Spreadsheet analysis
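{{/*
The JSONL row above refers to chat-format records for MLX LoRA training. The exact schema that lem export writes is not shown on this page; the sketch below assumes the common one-object-per-line layout with a "messages" array of role/content pairs. The Message and ChatExample types, the chatLine helper, and the sample text are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// Message is one chat turn (assumed field names: role, content).
type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// ChatExample is one training record, serialized as a single JSONL line.
type ChatExample struct {
	Messages []Message `json:"messages"`
}

// chatLine encodes a prompt/response pair as one chat-format JSON line.
func chatLine(prompt, response string) (string, error) {
	ex := ChatExample{Messages: []Message{
		{Role: "user", Content: prompt},
		{Role: "assistant", Content: response},
	}}
	b, err := json.Marshal(ex)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	line, err := chatLine("What is a seed prompt?", "A source prompt used for generation.")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Each record occupies exactly one line of the train, valid, or test file.
	fmt.Println(line)
}
```

Keeping one complete JSON object per line lets the train/valid/test files be split or concatenated by line without re-parsing.
*/}}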
{{end}}
{{template "footer"}}