go/FINDINGS.md
Snider 3ff7b8a773 docs: add orchestration dispatch queue and research findings
TODO.md tracks tasks dispatched to satellite repos (go-i18n phases 1-3).
FINDINGS.md records go-i18n architecture assessment and CoreDeno PR #9 review.
Phase 2 expanded with 1B classification pipeline based on LEK benchmarks.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 15:51:54 +00:00

3.5 KiB

FINDINGS.md — Core Go Research

go-i18n (forge.lthn.ai/core/go-i18n)

Explored: 2026-02-19 Location: /Users/snider/Code/host-uk/go-i18n Module: forge.lthn.ai/core/go-i18n State: 20 commits on main, clean, all tests pass Lines: ~5,800 across 32 files (14 test files) Deps: only golang.org/x/text

What It Is

A grammar engine — not a translation file manager. Provides:

  1. Forward composition: PastTense(), Gerund(), Pluralize(), Article(), handlers
  2. Reverse grammar: Tokeniser reads grammar tables backwards to extract structure
  3. GrammarImprint: Feature vector projection (content → grammar fingerprint, lossy)
  4. Multiplier: Deterministic training data augmentation (no LLM)

Consumers (core/cli, apps) bring their own translation files. go-i18n provides the grammar primitives.

Current Capabilities

Feature Status Notes
Grammar primitives (past/gerund/plural/article) Working 100 irregular verbs, 40 irregular nouns
Magic namespace handlers (i18n.label/progress/count/done/fail/numeric) Working 6 handler types
Service + message lookup Working Thread-safe, fallback chain
Subject builder (S()) Working Fluent API with count/gender/location/formality
Plural categories (CLDR) Working 7+ languages
RTL/LTR detection Working 12+ RTL languages
Number formatting Working Locale-specific separators
Reversal tokeniser Working 3-tier: JSON → irregular → regular morphology
GrammarImprint similarity Working Weighted cosine (verbs 30%, tense 20%, nouns 25%)
Multiplier expand Working Tense + number flipping, dedup, round-trip verify

What's Missing / Incomplete

Gap Priority Notes
Reference distribution builder High Process scored seeds → calibrate imprints
Non-English grammar tables Medium Only en.json exists, reversal needs gram.* per language
Ambiguity resolution Medium "run", "file", "test" are both verb and noun
Domain vocabulary expansion Low 150+ words, needs legal/medical/financial
Poindexter integration Deferred Awaiting Poindexter library
TIM container image Deferred Distroless Go binary for confidential compute

Key Architecture Decisions

  • Bijective grammar tables: Forward and reverse use same JSON → reversal is deterministic
  • Lossy projection: GrammarImprint intentionally loses content, preserves only structure
  • No LLM dependency: Multiplier generates variants purely from morphological rules
  • Consumer translations are external: go-i18n doesn't ship or manage app-specific locale files
  • gram. keys are sacred*: Agents MUST NOT flatten — grammar engine depends on nested structure

pkg/i18n in core/go

  • Full i18n framework with 34 locale files — but locale data is bad/stale
  • Only imported by pkg/cli/ which has been extracted to core/cli
  • Effectively orphaned in core/go
  • Can be removed once core/cli imports go-i18n directly
  • The locale files need full rework, not migration

CoreDeno (PR #9 — merged)

Explored: 2026-02-19

Deno sidecar for core-gui JS runtime. Go↔Deno bidirectional bridge:

  • Go→Deno: JSON-RPC over Unix socket (module lifecycle)
  • Deno→Go: gRPC over Unix socket (file I/O, store, manifest)
  • Each module in isolated Deno Worker with declared permissions
  • Marketplace: git clone + ed25519 manifest verification + SQLite registry

10 security/correctness issues found and fixed in review.