docs: graduate TODO/FINDINGS into production documentation

Replace internal task tracking (TODO.md, FINDINGS.md) with structured
documentation in docs/. Trim CLAUDE.md to agent instructions only.

Co-Authored-By: Virgil <virgil@lethean.io>
Snider 2026-02-20 15:01:55 +00:00
parent aae5d21ad4
commit 53edd80476
6 changed files with 632 additions and 115 deletions


@ -1,54 +1,38 @@
# CLAUDE.md
## Project
`go-html` is an HLCRF DOM compositor with grammar pipeline. Module path: `forge.lthn.ai/core/go-html`
Agent instructions for `go-html`. Module path: `forge.lthn.ai/core/go-html`
## Commands
```bash
go test ./... # Run all tests
go test -run TestName ./... # Single test
go test -bench . ./... # Benchmarks
go vet ./... # Static analysis
GOOS=js GOARCH=wasm go build -o gohtml.wasm ./cmd/wasm/ # WASM build
go test ./... # Run all tests
go test -run TestName ./... # Single test
go test -short ./... # Skip slow WASM build test
go test -bench . ./... # Benchmarks
go vet ./... # Static analysis
GOOS=js GOARCH=wasm go build -ldflags="-s -w" -o gohtml.wasm ./cmd/wasm/ # WASM build
echo '{"H":"nav-bar","C":"main-content"}' | go run ./cmd/codegen/ # Codegen CLI
```
## Architecture
See `docs/architecture.md` for full detail. Summary:
- **Node interface**: `Render(ctx *Context) string` — El, Text, Raw, If, Unless, Each[T], Switch, Entitled
- **HLCRF Layout**: Header/Left/Content/Right/Footer compositor with ARIA roles and deterministic `data-block` IDs
- **Responsive**: Multi-variant breakpoint wrapper (`data-variant` attributes)
- **Pipeline**: Render → StripTags → go-i18n/reversal Tokenise → GrammarImprint
- **Codegen**: Web Component classes with closed Shadow DOM
- **WASM**: `cmd/wasm/` exports `renderToString()` only (lean client-side renderer, ~830KB gzip)
- **Codegen CLI**: `cmd/codegen/` reads slot JSON from stdin, writes WC bundle JS to stdout (build-time tool, not in WASM)
- **Pipeline**: Render → StripTags → go-i18n/reversal Tokenise → GrammarImprint (server-side only)
- **Codegen**: Web Component classes with closed Shadow DOM, generated at build time by `cmd/codegen/`
- **WASM**: `cmd/wasm/` exports `renderToString()` only — 2.90 MB raw / 842 KB gzip
## Server/Client Split
Files guarded with `//go:build !js` are excluded from WASM builds:
- `pipeline.go` — Imprint/CompareVariants use `go-i18n/reversal` (heavyweight, server-side only)
Files guarded with `//go:build !js` are excluded from WASM:
- `pipeline.go` — Imprint/CompareVariants use `go-i18n/reversal` (server-side only)
- `cmd/wasm/register.go` — encoding/json + codegen (replaced by `cmd/codegen/` CLI)
WASM binary contains only: node types, layout, responsive, context, render, path, and go-i18n core (translation).
## Dependencies
- `forge.lthn.ai/core/go-i18n` (replace directive → `../go-i18n`)
- go-i18n must be present alongside this repo for builds
## Coding Standards
- UK English (colour, organisation, centre)
- All types annotated
- Tests use `testify` assert/require
- Licence: EUPL-1.2
- Safe-by-default: HTML escaping on Text nodes, void element handling, entitlement deny-by-default
- Deterministic output: sorted attributes, reproducible paths
## Test Conventions
No specific suffix pattern — use table-driven subtests with `t.Run()`.
Never import `encoding/json`, `text/template`, or `fmt` in WASM-linked code. Use string concatenation instead of `fmt.Sprintf` in `layout.go` and any other file without a `!js` guard.
## Key Files
@ -56,9 +40,29 @@ No specific suffix pattern — use table-driven subtests with `t.Run()`.
|------|---------|
| `node.go` | All node types (El, Text, Raw, If, Unless, Each, Switch, Entitled) |
| `layout.go` | HLCRF compositor |
| `pipeline.go` | StripTags, Imprint, CompareVariants |
| `pipeline.go` | StripTags, Imprint, CompareVariants (!js only) |
| `responsive.go` | Multi-variant breakpoint wrapper |
| `context.go` | Rendering context (Identity, Locale, Entitlements, i18n Service) |
| `codegen/codegen.go` | Web Component class generation |
| `cmd/wasm/main.go` | WASM entry point (`renderToString` only) |
| `cmd/wasm/main.go` | WASM entry point (renderToString only) |
| `cmd/codegen/main.go` | Build-time CLI for WC bundle generation |
| `cmd/wasm/size_test.go` | WASM binary size gate (< 1 MB gzip, < 3 MB raw) |
## Dependencies
- `forge.lthn.ai/core/go-i18n` (replace directive → `../go-i18n`)
- `go-i18n` and `go-inference` must be present alongside this repo for builds
## Coding Standards
- UK English (colour, organisation, centre)
- All types annotated
- Tests use `testify` assert/require
- Licence: EUPL-1.2 — add `// SPDX-Licence-Identifier: EUPL-1.2` to new files
- Safe-by-default: HTML escaping on Text nodes, void element handling, entitlement deny-by-default
- Deterministic output: sorted attributes, reproducible paths
- Commits: conventional commits + `Co-Authored-By: Virgil <virgil@lethean.io>`
## Test Conventions
No specific suffix pattern. Use table-driven subtests with `t.Run()`. Integration tests that use `Text` nodes must call `i18n.SetDefault(svc)` before rendering.


@ -1,35 +0,0 @@
# Findings
## Code Quality
- **53 tests, 100% pass** — excellent coverage ratios across all packages
- **Zero TODOs/FIXMEs** in codebase — clean
- **`go vet` clean** — no static analysis warnings
- **Safe-by-default design** — XSS prevention verified in render_test.go, HTML escaping on all Text nodes, void elements self-close, entitlements deny-by-default
## Architecture Strengths
- Clean minimal API: 9 public constructors + Node interface
- Type-safe generics: `Each[T]` for iteration
- Deterministic output: sorted attributes, reproducible block IDs
- Fluent builder pattern: `NewLayout("HLCRF").H(node).C(node).F(node)`
- Pipeline bridges rendering to privacy layer (GrammarImprint via go-i18n reversal)
## Known Issues
1. **WASM size blocker** — 6.0 MB raw / 1.58 MB gzip. Root cause: stdlib imports (json, encoding, text/template) bloat the WASM binary. Makefile rejects at 1 MB gzip threshold.
2. **No WASM main_test.go** — cmd/wasm/ has register_test.go but no integration test for the JS exports.
3. **Layout accepts invalid variants silently** — `NewLayout("XYZ")` renders nothing, no error returned.
4. **Context.service is private** — Must use `NewContextWithService()`. Limits custom i18n adapter injection.
5. **Responsive only accepts *Layout** — Cannot nest arbitrary nodes in variants, must wrap in Layout first.
## Coverage Gaps
| File | Lines | Tests |
|------|-------|-------|
| node.go | 254 | 206 lines of tests (81%) |
| layout.go | 119 | 116 lines (97%) |
| pipeline.go | 83 | 128 lines (154%) |
| responsive.go | 39 | 89 lines (228%) |
| codegen.go | 90 | 54 lines (60%) |
| cmd/wasm/main.go | 78 | **0 lines (0%)** |

TODO.md

@ -1,45 +0,0 @@
# TODO
## High Priority — WASM Binary Size Fix
~~Current: 6.04 MB raw / 1.58 MB gzip.~~ Fixed: 2.90 MB raw / 830 KB gzip. Target: < 1 MB gzip (Makefile gate: WASM_GZ_LIMIT 1048576).
Root cause: `registerComponents()` pulled in `encoding/json` (~200KB gz), `text/template` (~125KB gz), and `fmt` (~50KB gz). Plus `pipeline.go` linked `go-i18n/reversal` (~250KB gz). These were heavyweight imports for code that doesn't need to run client-side.
### Step 1: Remove `registerComponents()` from WASM
- [x] **Move `cmd/wasm/register.go` out of WASM** — Added `//go:build !js` build tag. The `registerComponents()` JS bridge in `main.go` removed. This removes `encoding/json` and `text/template` from the binary.
- [x] **Move codegen to build-time CLI** — Created `cmd/codegen/main.go` that reads slot config from stdin (JSON) and writes generated JS to stdout. Usage: `echo '{"H":"nav-bar","C":"main-content"}' | go run ./cmd/codegen/ > components.js`. Consumers pre-generate during build.
- [x] **Update `cmd/wasm/main.go`** — Removed `registerComponents` from the `gohtml` JS object. Only exposes `renderToString`.
### Step 2: Remove Pipeline from WASM
- [x] **Guard `pipeline.go` with build tag** — Added `//go:build !js` to `pipeline.go`. The `Imprint()` and `CompareVariants()` functions use `go-i18n/reversal` which is heavyweight. Server-side analysis only.
- [x] **Update `cmd/wasm/main.go`** — No references to `pipeline.go` functions. `renderToString` never used them.
### Step 3: Minimise `fmt` Usage
- [x] **Replace `fmt.Sprintf` in WASM-linked code** — Replaced `fmt.Sprintf` in `layout.go` `blockID()` with string concatenation. `fmt` eliminated from the WASM import graph.
### Step 4: Verify Size
- [x] **Build and measure** — 2,900,777 bytes raw, 830,314 bytes gzip (842,146 via `make wasm`). Well under 1 MB limit.
- [x] **Document the server/client split** — Updated CLAUDE.md with new architecture: WASM = `renderToString()` only, codegen = build-time CLI.
### Step 5: Tests
- [x] **WASM build gate test** — `TestWASMBinarySize` in `cmd/wasm/size_test.go`: builds WASM, gzips, asserts < 1MB gzip and < 3MB raw. Result: 2.90MB raw, 842KB gzip. `//go:build !js` guarded.
- [x] **Codegen CLI test** — `cmd/codegen/main_test.go`: pipe JSON stdin, verify JS output matches `GenerateBundle()`
- [x] **renderToString still works** — Existing WASM tests for `renderToString` pass (build-tag guarded)
- [x] **Existing tests still pass** — `go test ./...` (non-WASM) all 70+ tests pass, pipeline/codegen tests unaffected
## Medium Priority
- [ ] **TypeScript type definitions** — Add `.d.ts` generation alongside `GenerateBundle()` for Web Component consumers.
- [ ] **Accessibility helpers** — Layout has semantic HTML + ARIA roles but no `aria-label` builder, alt text helpers, or focus management nodes.
- [ ] **Layout variant validation** — `NewLayout("XYZ")` silently produces empty output. Add warning or error for invalid slot characters.
## Low Priority
- [ ] **Browser polyfill documentation** — Document closed Shadow DOM support matrix.
- [ ] **CSS scoping helper** — Optional utility for responsive variant CSS targeting.

docs/architecture.md Normal file

@ -0,0 +1,217 @@
# Architecture
`go-html` is an HLCRF DOM compositor with grammar pipeline integration. It provides a pure-Go, type-safe HTML rendering library designed for server-side generation with an optional lightweight WASM client module.
Module path: `forge.lthn.ai/core/go-html`
## Node Interface
All renderable units implement a single interface:
```go
type Node interface {
	Render(ctx *Context) string
}
```
Every node type is a private struct with a public constructor. The API surface is intentionally small: nine public constructors plus `Attr()` and `Render()` helpers.
| Constructor | Description |
|-------------|-------------|
| `El(tag, ...Node)` | HTML element with children |
| `Attr(Node, key, value)` | Set attribute on an El node; chainable |
| `Text(key, ...any)` | Translated, HTML-escaped text via go-i18n |
| `Raw(content)` | Unescaped trusted content |
| `If(cond, Node)` | Conditional render |
| `Unless(cond, Node)` | Inverse conditional render |
| `Each[T](items, fn)` | Type-safe iteration with generics |
| `Switch(selector, cases)` | Runtime dispatch to named cases |
| `Entitled(feature, Node)` | Entitlement-gated render; deny-by-default |
### Safety guarantees
- `Text` nodes are always HTML-escaped. XSS via user-supplied strings fed through `Text()` is not possible.
- `Raw` is an explicit escape hatch for trusted content only. Its name signals intent.
- `Entitled` returns an empty string when no entitlement function is set on the context. Access is denied by default, not granted.
- `El` attributes are sorted alphabetically before output, producing deterministic HTML regardless of insertion order.
- Void elements (`br`, `img`, `input`, etc.) never emit a closing tag.
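The escaping and attribute-ordering guarantees above can be illustrated with a small standalone sketch. It is not the `go-html` implementation: `renderAttrs` is a hypothetical helper, and the use of `html.EscapeString` from the standard library is an assumption about how escaping might be done.

```go
package main

import (
	"fmt"
	"html"
	"sort"
	"strings"
)

// renderAttrs is a hypothetical helper (not part of go-html). It emits
// attributes in alphabetical key order so that the output does not depend
// on Go's randomised map iteration order.
func renderAttrs(attrs map[string]string) string {
	keys := make([]string, 0, len(attrs))
	for k := range attrs {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	for _, k := range keys {
		b.WriteByte(' ')
		b.WriteString(k)
		b.WriteString(`="`)
		b.WriteString(html.EscapeString(attrs[k])) // attribute values are escaped too
		b.WriteByte('"')
	}
	return b.String()
}

func main() {
	attrs := map[string]string{"id": "hero", "class": "wide", "data-block": "C-0"}
	// Text content is escaped, so markup in user input is rendered inert.
	body := html.EscapeString("<script>alert(1)</script>")
	fmt.Println("<div" + renderAttrs(attrs) + ">" + body + "</div>")
	// prints: <div class="wide" data-block="C-0" id="hero">&lt;script&gt;alert(1)&lt;/script&gt;</div>
}
```

The sketch uses `fmt` only for printing; code linked into the WASM binary would avoid it, as described elsewhere in this document.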
## HLCRF Layout
The `Layout` type is a compositor for five named slots: **H**eader, **L**eft, **C**ontent, **R**ight, **F**ooter. Each slot maps to a specific semantic HTML element and ARIA role:
| Slot | Element | ARIA role |
|------|---------|-----------|
| H | `<header>` | `banner` |
| L | `<aside>` | `complementary` |
| C | `<main>` | `main` |
| R | `<aside>` | `complementary` |
| F | `<footer>` | `contentinfo` |
A layout variant string selects which slots are rendered and in which order:
```go
NewLayout("HLCRF") // all five slots
NewLayout("HCF") // header, content, footer — no sidebars
NewLayout("C") // content only
```
Each rendered slot receives a deterministic `data-block` attribute encoding its position in the tree. The root layout produces IDs in the form `{slot}-0` (e.g., `H-0`, `C-0`). Nested layouts extend the parent's block ID as a path prefix: a `Layout` placed inside the `L` slot of a root layout will produce inner slot IDs like `L-0-H-0`, `L-0-C-0`.
This path scheme is computed without `fmt.Sprintf` — using simple string concatenation — to keep `fmt` out of the WASM import graph.
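A minimal sketch of how such an ID could be built with concatenation alone is shown below. The function name, its signature, and the handling of the `-` separator between parent path and slot are assumptions made for illustration; the real `blockID()` method may differ.

```go
package main

import "fmt"

// blockID is a hypothetical, standalone version of the path scheme.
// parent is the enclosing block ID ("" at the root); slot is one of H, L, C, R, F.
func blockID(parent string, slot byte) string {
	id := string(slot) + "-0" // plain concatenation, no fmt.Sprintf
	if parent == "" {
		return id
	}
	return parent + "-" + id
}

func main() {
	// fmt is used here only to print the result of the sketch.
	fmt.Println(blockID("", 'H'))    // prints: H-0
	fmt.Println(blockID("L-0", 'C')) // prints: L-0-C-0
}
```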
### Nested layouts
`Layout` implements `Node`, so it can be placed inside any slot of another layout. At render time, nested layouts are cloned and their `path` field is set to the parent's block ID. This clone-on-render approach avoids shared mutation and is safe for concurrent use.
```go
inner := NewLayout("HCF").H(Raw("nav")).C(Raw("body")).F(Raw("links"))
outer := NewLayout("HLCRF").H(Raw("top")).L(inner).C(Raw("main")).F(Raw("foot"))
```
### Fluent builder
All slot methods return the `*Layout` for chaining. Multiple nodes may be appended to the same slot across multiple calls:
```go
NewLayout("HCF").
	H(El("h1", Text("Title"))).
	C(El("p", Text("Content")), Raw("<hr>")).
	F(El("small", Text("Copyright")))
```
## Responsive Compositor
`Responsive` wraps multiple named `Layout` variants for breakpoint-aware rendering. Each variant renders inside a `<div data-variant="name">` container, giving CSS media queries or JavaScript a stable hook for show/hide logic.
```go
NewResponsive().
	Variant("desktop", NewLayout("HLCRF")...).
	Variant("tablet", NewLayout("HCF")...).
	Variant("mobile", NewLayout("C")...)
```
`Responsive` itself implements `Node` and may be passed to `Imprint()` for cross-variant semantic analysis.
Note: `Responsive.Variant()` accepts only `*Layout`, not arbitrary `Node` values. Arbitrary subtrees must be wrapped in a layout first.
## Rendering Context
`Context` carries per-request state through the entire node tree:
```go
type Context struct {
	Identity     string
	Locale       string
	Entitlements func(feature string) bool
	Data         map[string]any
	service      *i18n.Service // private; set via NewContextWithService()
}
```
The `service` field is intentionally unexported. Custom i18n adapter injection requires `NewContextWithService(svc)`. This prevents callers from setting it inconsistently after construction.
When `ctx.service` is nil, `Text` nodes fall back to the global `i18n.T()` default service.
## Grammar Pipeline
The grammar pipeline is a server-side-only feature. It is guarded with `//go:build !js` and absent from all WASM builds.
### StripTags
`StripTags(html string) string` converts rendered HTML to plain text. Tag boundaries are collapsed to single spaces; the result is trimmed. The implementation is a single-pass rune scanner: no regex, no allocations beyond the output builder. It does not attempt to elide `<script>` or `<style>` content because `go-html` never generates those elements.
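The single-pass approach can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the library's source: `stripTags` is a hypothetical name, and this version also collapses runs of whitespace inside text, which the real function may or may not do.

```go
package main

import (
	"fmt"
	"strings"
)

// stripTags is a hypothetical single-pass scanner. Anything between '<' and
// '>' is dropped, and each tag boundary contributes at most one space.
func stripTags(s string) string {
	var b strings.Builder
	inTag := false
	pendingSpace := false // a separator is owed before the next text rune
	for _, r := range s {
		switch {
		case r == '<':
			inTag = true
			pendingSpace = true
		case r == '>' && inTag:
			inTag = false
		case inTag:
			// inside a tag: skip the rune
		case r == ' ' || r == '\t' || r == '\n' || r == '\r':
			pendingSpace = true // assumption: whitespace runs collapse as well
		default:
			// Emit the owed space only between text runs, which keeps the
			// result trimmed at both ends without a second pass.
			if pendingSpace && b.Len() > 0 {
				b.WriteByte(' ')
			}
			pendingSpace = false
			b.WriteRune(r)
		}
	}
	return b.String()
}

func main() {
	fmt.Printf("%q\n", stripTags("<div><p>a</p><p>b</p></div>")) // prints: "a b"
}
```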
### Imprint
`Imprint(node Node, ctx *Context) reversal.GrammarImprint` runs the full render-to-analysis pipeline:
1. Call `node.Render(ctx)` to produce HTML.
2. Pass HTML through `StripTags` to extract plain text.
3. Pass plain text through `go-i18n/reversal.Tokeniser` to produce a token sequence.
4. Wrap tokens in a `reversal.GrammarImprint` for structural analysis.
The resulting `GrammarImprint` exposes `TokenCount`, `UniqueVerbs`, and a `Similar()` method for pairwise semantic similarity scoring. This bridges the rendering layer to the privacy and analytics layers of the Lethean stack.
### CompareVariants
`CompareVariants(r *Responsive, ctx *Context) map[string]float64` runs `Imprint` on each named layout variant in a `Responsive` and returns pairwise similarity scores. Keys are `"name1:name2"`. This enables detection of semantically divergent responsive variants — for example, a mobile layout that strips critical information that appears in the desktop variant.
## Server/Client Split
The binary split is enforced by Go build tags.
| File | Build tag | Reason for exclusion from WASM |
|------|-----------|-------------------------------|
| `pipeline.go` | `//go:build !js` | Imports `go-i18n/reversal` (~250 KB gzip) |
| `cmd/wasm/register.go` | `//go:build !js` | Imports `encoding/json` (~200 KB gzip) and `text/template` (~125 KB gzip) |
The WASM binary includes only: node types, layout, responsive, context, render, path, and go-i18n core (translation). No codegen, no pipeline, no JSON, no templates, no `fmt`.
## WASM Module
The WASM entry point is `cmd/wasm/main.go`, compiled with `GOOS=js GOARCH=wasm`.
It exposes a single JavaScript function on `window.gohtml`:
```js
gohtml.renderToString(variant, locale, slots)
```
- `variant`: HLCRF variant string, e.g. `"HCF"`.
- `locale`: BCP 47 locale string for i18n, e.g. `"en-GB"`.
- `slots`: object with optional keys `H`, `L`, `C`, `R`, `F` containing HTML strings.
Slot content is injected via `Raw()`. The caller is responsible for sanitisation. This is intentional: the WASM module is a rendering engine for trusted content produced server-side or by the application's own templates.
### Size gate
`cmd/wasm/size_test.go` contains `TestWASMBinarySize_Good`, a build-gated test that:
1. Builds the WASM binary with `-ldflags=-s -w`.
2. Gzip-compresses the output at best compression.
3. Asserts the compressed size is below 1,048,576 bytes (1 MB).
4. Asserts the raw size is below 3,145,728 bytes (3 MB).
This test is skipped under `go test -short`. It is guarded with `//go:build !js` so it does not run within the WASM environment itself. Current measured size: 2.90 MB raw, 842 KB gzip.
## Codegen CLI
`cmd/codegen/main.go` is a build-time tool for generating Web Component JavaScript bundles from HLCRF slot assignments. It reads a JSON slot map from stdin and writes the generated JS to stdout.
```bash
echo '{"H":"nav-bar","C":"main-content"}' | go run ./cmd/codegen/ > components.js
```
The `codegen` package generates ES2022 class definitions with closed Shadow DOM. The generated pattern per component:
- A class extending `HTMLElement` with a private `#shadow` field.
- `constructor()` attaches a closed shadow root (`mode: "closed"`).
- `connectedCallback()` dispatches a `wc-ready` custom event with the tag name and slot.
- `render(html)` sets shadow content from a `<template>` clone.
- `customElements.define()` registration.
Closed Shadow DOM provides style isolation. Content is set via the DOM API, never via `innerHTML` directly on the element.
Tag names must contain a hyphen (Web Components specification requirement). `TagToClassName()` converts kebab-case tags to PascalCase class names: `nav-bar` becomes `NavBar`.
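The conversion can be sketched as below. This is a hypothetical reimplementation for illustration only; the real `TagToClassName()` may handle edge cases differently, and the sketch assumes ASCII tag names.

```go
package main

import (
	"fmt"
	"strings"
)

// tagToClassName is a hypothetical sketch of the kebab-case to PascalCase
// conversion. It assumes ASCII tag names, so slicing by byte is safe.
func tagToClassName(tag string) string {
	var b strings.Builder
	for _, part := range strings.Split(tag, "-") {
		if part == "" {
			continue // tolerate doubled or trailing hyphens
		}
		b.WriteString(strings.ToUpper(part[:1]))
		b.WriteString(part[1:])
	}
	return b.String()
}

func main() {
	fmt.Println(tagToClassName("nav-bar"))      // prints: NavBar
	fmt.Println(tagToClassName("main-content")) // prints: MainContent
}
```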
The codegen CLI uses `encoding/json` and `text/template`, which are excluded from the WASM build. Consumers generate the JS bundle at build time, not at runtime.
## Block ID Path Scheme
`path.go` exports `ParseBlockID(id string) []byte`, which extracts the slot letter sequence from a `data-block` attribute value.
Format: slots are separated by `-0-`. The sequence `L-0-C-0` decodes to `['L', 'C']`, meaning the content slot of a layout nested inside the left slot.
This scheme is deterministic and human-readable. It enables server-side or client-side code to locate a specific block in the rendered tree by path.
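A decoder for this format might look like the sketch below. It is not the code in `path.go`: the lowercase `parseBlockID` name marks it as hypothetical, and returning `nil` for malformed input is an assumption, since the source does not say how the real function treats invalid IDs.

```go
package main

import (
	"fmt"
	"strings"
)

// parseBlockID is a hypothetical decoder for data-block values such as
// "L-0-C-0". It expects alternating slot letters and "0" indices.
func parseBlockID(id string) []byte {
	if id == "" {
		return nil
	}
	parts := strings.Split(id, "-")
	if len(parts)%2 != 0 {
		return nil // assumption: malformed IDs yield nil
	}
	slots := make([]byte, 0, len(parts)/2)
	for i := 0; i < len(parts); i += 2 {
		if len(parts[i]) != 1 || parts[i+1] != "0" {
			return nil
		}
		slots = append(slots, parts[i][0])
	}
	return slots
}

func main() {
	fmt.Printf("%q\n", parseBlockID("L-0-C-0")) // prints: "LC"
	fmt.Printf("%q\n", parseBlockID("H-0"))     // prints: "H"
}
```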
## Dependency Graph
```
go-html
├── forge.lthn.ai/core/go-i18n (direct, all builds)
│ └── forge.lthn.ai/core/go-inference (indirect)
├── forge.lthn.ai/core/go-i18n/reversal (server builds only, !js)
└── github.com/stretchr/testify (test only)
```
Both `go-i18n` and `go-html` are developed in parallel. The `go.mod` uses a `replace` directive pointing to `../go-i18n`. Both repositories must be present on the local filesystem for builds and tests.

docs/development.md Normal file

@ -0,0 +1,257 @@
# Development Guide
## Prerequisites
- Go 1.25 or later (Go workspace required).
- `go-i18n` repository cloned alongside this one: `../go-i18n` relative to the repository root. The `go.mod` `replace` directive points there.
- `go-inference` also resolved via `replace` directive at `../go-inference`. It is an indirect dependency pulled in by `go-i18n`.
- `testify` is the only external test dependency; it is fetched by the Go module system.
No additional tools are required for server-side development. WASM builds require a standard Go installation with `GOOS=js GOARCH=wasm` cross-compilation support, which is included in all official Go distributions.
## Directory Layout
```
go-html/
├── node.go Node interface and all node types
├── layout.go HLCRF compositor
├── pipeline.go StripTags, Imprint, CompareVariants (!js only)
├── responsive.go Multi-variant breakpoint wrapper
├── context.go Rendering context
├── render.go Render() convenience function
├── path.go ParseBlockID() for data-block path decoding
├── codegen/
│ └── codegen.go Web Component JS generation (server-side)
├── cmd/
│ ├── codegen/
│ │ └── main.go Build-time CLI (stdin JSON → stdout JS)
│ └── wasm/
│ ├── main.go WASM entry point (js+wasm build only)
│ ├── register.go buildComponentJS helper (!js only)
│ └── size_test.go WASM binary size gate test (!js only)
└── docs/
└── plans/ Phase design documents (historical)
```
## Running Tests
```bash
# All tests
go test ./...
# Single test by name
go test -run TestWASMBinarySize_Good ./cmd/wasm/
# Skip slow WASM build test
go test -short ./...
# Tests with verbose output
go test -v ./...
```
Tests use `testify` assert and require helpers. Test names follow Go's standard `TestFunctionName` convention. Subtests use `t.Run()` with descriptive names.
The WASM size gate test (`TestWASMBinarySize_Good`) builds the WASM binary as a subprocess and is therefore slow. It is skipped automatically under `-short`. It is also guarded with `//go:build !js` so it cannot run under `GOARCH=wasm`.
## Benchmarks
```bash
# All benchmarks
go test -bench . ./...
# Specific benchmark
go test -bench BenchmarkRender_FullPage ./...
# With memory allocations
go test -bench . -benchmem ./...
# Fixed benchmark duration (use e.g. -benchtime=100x for a fixed iteration count)
go test -bench . -benchtime=5s ./...
```
Benchmarks are organised by operation:
| Group | Variants |
|-------|---------|
| `BenchmarkRender_*` | Depth 1, 3, 5, 7 trees; full page |
| `BenchmarkLayout_*` | Content-only, HCF, HLCRF, nested, many children |
| `BenchmarkEach_*` | 10, 100, 1000 items |
| `BenchmarkResponsive_*` | Three-variant compositor |
| `BenchmarkStripTags_*` | Short and long HTML inputs |
| `BenchmarkImprint_*` | Small and large page trees |
| `BenchmarkCompareVariants_*` | Two and three variant comparison |
## WASM Build
```bash
GOOS=js GOARCH=wasm go build -ldflags="-s -w" -o gohtml.wasm ./cmd/wasm/
```
Strip flags (`-s -w`) are required. Without them the binary is approximately 50% larger.
The Makefile target `make wasm` performs the build and measures the gzip size:
```bash
make wasm
```
The Makefile enforces a 1 MB gzip limit (`WASM_GZ_LIMIT = 1048576`). The build fails if this limit is exceeded.
To verify the size manually:
```bash
gzip -c -9 gohtml.wasm | wc -c
```
Current measured output: 2.90 MB raw, 842 KB gzip.
## Codegen CLI
The codegen CLI reads a JSON slot map from stdin and writes a Web Component JS bundle to stdout. It is a build-time tool, not intended for runtime use.
```bash
# Generate components for a three-slot layout
echo '{"H":"site-header","C":"app-content","F":"site-footer"}' \
| go run ./cmd/codegen/ \
> components.js
```
The JSON keys are HLCRF slot letters (`H`, `L`, `C`, `R`, `F`). The values are custom element tag names (must contain a hyphen). Duplicate tag values are deduplicated.
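The input handling described above can be sketched as follows. This is an illustration, not the CLI's source: `uniqueTags` is a hypothetical helper, the error wording is invented, and sorting the result is a choice made here to keep the example's output stable.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// uniqueTags is a hypothetical helper. It parses a slot map such as
// {"H":"nav-bar","C":"main-content"}, rejects tag names without a hyphen,
// and returns each distinct tag once.
func uniqueTags(slotJSON string) ([]string, error) {
	var slots map[string]string
	if err := json.Unmarshal([]byte(slotJSON), &slots); err != nil {
		return nil, fmt.Errorf("codegen: invalid slot map: %w", err)
	}
	seen := make(map[string]bool, len(slots))
	tags := make([]string, 0, len(slots))
	for _, tag := range slots {
		if !strings.Contains(tag, "-") {
			return nil, fmt.Errorf("codegen: tag %q must contain a hyphen", tag)
		}
		if !seen[tag] {
			seen[tag] = true
			tags = append(tags, tag)
		}
	}
	sort.Strings(tags) // assumption: sorted only to make the sketch deterministic
	return tags, nil
}

func main() {
	tags, err := uniqueTags(`{"H":"nav-bar","F":"nav-bar","C":"main-content"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(tags) // prints: [main-content nav-bar]
}
```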
To test the CLI:
```bash
go test ./cmd/codegen/
```
## Static Analysis
```bash
go vet ./...
```
The codebase passes `go vet` with no warnings.
## Coding Standards
### Language
UK English throughout: colour, organisation, centre, behaviour, licence (noun), serialise. American spellings are not used.
### Types
All exported and unexported functions carry full parameter and return type annotations. The `any` alias is used in preference to `interface{}`.
### Error handling
Errors are wrapped with context using `fmt.Errorf("pkg.Function: %w", err)`. The codegen package prefixes all errors with `codegen:`.
### HTML safety
- Use `Text()` for any user-supplied or translated content. It escapes HTML.
- Use `Raw()` only for content you control or have sanitised upstream.
- Never construct HTML by string concatenation in application code.
### Determinism
Output must be deterministic. Attributes are sorted before rendering. `map` iteration in `codegen.GenerateBundle()` may produce non-deterministic class order across runs — this is acceptable because Web Component registration order does not affect correctness.
### Build tags
Files excluded from WASM use `//go:build !js` as the first line, before the `package` declaration. Files compiled only under WASM use `//go:build js && wasm`. Do not use the older `// +build` syntax.
### Licence
All files carry the EUPL-1.2 SPDX identifier:
```go
// SPDX-Licence-Identifier: EUPL-1.2
```
### Commit format
Conventional commits with lowercase type and optional scope:
```
feat(codegen): add TypeScript type definition generation
fix(wasm): correct slot injection for empty strings
test: add edge case for Unicode surrogate pairs
docs: update architecture with pipeline diagram
```
Commits include a co-author trailer:
```
Co-Authored-By: Virgil <virgil@lethean.io>
```
## Test Patterns
### Standard unit test
```go
func TestElNode_Render(t *testing.T) {
	ctx := NewContext()
	node := El("div", Raw("content"))
	got := node.Render(ctx)
	want := "<div>content</div>"
	if got != want {
		t.Errorf("El(\"div\", Raw(\"content\")).Render() = %q, want %q", got, want)
	}
}
```
### Table-driven subtest
```go
func TestStripTags(t *testing.T) {
	cases := []struct {
		name  string
		input string
		want  string
	}{
		{"empty", "", ""},
		{"plain", "hello", "hello"},
		{"single tag", "<p>hello</p>", "hello"},
		{"nested", "<div><p>a</p><p>b</p></div>", "a b"},
	}
	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			got := StripTags(tc.input)
			if got != tc.want {
				t.Errorf("StripTags(%q) = %q, want %q", tc.input, got, tc.want)
			}
		})
	}
}
```
### Integration test with i18n
```go
func TestIntegration_RenderThenReverse(t *testing.T) {
	svc, _ := i18n.New()
	i18n.SetDefault(svc)
	ctx := NewContext()
	page := NewLayout("HCF").
		H(El("h1", Text("Building project"))).
		C(El("p", Text("Files deleted successfully"))).
		F(El("small", Text("Completed")))
	imp := Imprint(page, ctx)
	if imp.UniqueVerbs == 0 {
		t.Error("reversal found no verbs in rendered page")
	}
}
```
Integration tests that exercise the full pipeline (`Imprint`, `CompareVariants`) must initialise the i18n default service with `i18n.SetDefault(svc)` before rendering any `Text` nodes. The `bench_test.go` `init()` function does this for benchmarks; individual integration tests must do so explicitly.
## Known Limitations
- `NewLayout("XYZ")` silently produces empty output when given unrecognised slot letters. There is no warning or error. Valid slot letters are `H`, `L`, `C`, `R`, `F`.
- `Responsive.Variant()` accepts only `*Layout`, not arbitrary `Node` values. Arbitrary subtrees must be wrapped in a single-slot layout.
- `Context.service` is private. Custom i18n adapter injection requires `NewContextWithService()`. There is no way to set or swap the service after construction.
- `cmd/wasm/main.go` has no integration test for the JS exports. The `size_test.go` file tests binary size only; it does not exercise `renderToString` behaviour.
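Until `NewLayout` validates its input, callers who want an early failure could check the variant string themselves. The helper below is a hypothetical sketch and not part of the `go-html` API; it relies only on the list of valid slot letters given above.

```go
package main

import (
	"fmt"
	"strings"
)

// invalidSlots is a hypothetical pre-flight check. It returns every rune in
// variant that is not one of the recognised slot letters H, L, C, R, F.
func invalidSlots(variant string) []rune {
	var bad []rune
	for _, r := range variant {
		if !strings.ContainsRune("HLCRF", r) {
			bad = append(bad, r)
		}
	}
	return bad
}

func main() {
	fmt.Printf("%q\n", string(invalidSlots("HCF"))) // prints: ""
	fmt.Printf("%q\n", string(invalidSlots("XYZ"))) // prints: "XYZ"
}
```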

docs/history.md Normal file

@ -0,0 +1,119 @@
# Project History
## Phase 1: Core Node Types (initial scaffolding)
Commits: `d7bb0b2` through `c724094`
The module was scaffolded with the Go module path `forge.lthn.ai/core/go-html`. The foundational work established:
- `d7bb0b2` — Module scaffold, `Node` interface with `Render(ctx *Context) string`.
- `3e76e72` — `Text` node wired to `go-i18n` grammar pipeline with HTML escaping.
- `c724094` — Conditional nodes (`If`, `Unless`), entitlement gating (`Entitled`, deny-by-default), runtime dispatch (`Switch`), and type-safe iteration (`Each[T]`).
The `Raw` escape hatch was present from the first commit. The decision to make `Text` always escape and `Raw` never escape was made at this stage and has not changed.
## Phase 2: HLCRF Layout and Pipeline
Commits: `946ea8d` through `ef77793`
- `946ea8d` — `Layout` type with HLCRF slot registry. Semantic HTML elements (`<header>`, `<main>`, `<aside>`, `<footer>`) and ARIA roles assigned per slot.
- `d75988a` — Nested layout path chains. Block IDs computed as `{slot}-0` at root, extended with `{parent}-{slot}-0` for nested layouts.
- `40da0d8` — Deterministic attribute sorting and thread-safe nested layout cloning (clone-on-render pattern).
- `f49ddbf` — `Attr()` helper for setting element attributes with chaining.
- `e041f76` — `Responsive` multi-variant compositor with `data-variant` containers.
- `8ac5123` — `StripTags` single-pass rune scanner for HTML-to-text stripping.
- `76cef5a` — `Imprint()` full render-reverse-imprint pipeline using `go-i18n/reversal`.
- `ef77793` — `CompareVariants()` pairwise semantic similarity scoring across responsive variants.
## Phase 3: WASM Entry Point
Commits: `456adce` through `9bc1fa7`
- `456adce` — Makefile with `wasm` target. Size gate: `WASM_GZ_LIMIT = 1048576` (1 MB). Initial measurement revealed the binary was already too large at this stage.
- `5acf63c` — WASM entry point `cmd/wasm/main.go` with `renderToString` exported to `window.gohtml`.
- `2fab89e` — Integration tests refactored to use `Imprint` pipeline.
- `e34c5c9` — WASM browser test harness added.
- `18d2933` — WASM binary size reporting improvements.
- `9bc1fa7` — Variant name escaping in `Responsive`, single-pass `StripTags` optimisation, WASM security contract documented in source.
## Phase 4: Codegen and Web Components
Commits: `937c08d` through `ab7ab92`
- `37b50ab`, `496513e` — Phase 4 design documents and implementation plan.
- `937c08d` — `codegen` package with `GenerateClass`, `GenerateBundle`, `TagToClassName`. Web Component classes with closed Shadow DOM.
- `dcd55a4` — `registerComponents` export added to `cmd/wasm/main.go`, bridging JSON slot config to WC bundle JS. This was the source of the subsequent binary size problem.
- `ab7ab92` — Transitive `replace` directive added for `go-inference`.
## WASM Binary Size Reduction
Commits: `6abda8b`, `4c65737`, `aae5d21`
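The initial WASM binary measured 6.04 MB raw / 1.58 MB gzip, 58% over the 1 MB gzip limit set in the Makefile. The root causes were four heavyweight imports linked into the WASM binary: three stdlib packages (`encoding/json` and `text/template`, pulled in by `registerComponents()`, plus `fmt`) and `go-i18n/reversal`, linked via `pipeline.go`: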
| Import | Approx. gzip contribution |
|--------|--------------------------|
| `encoding/json` | ~200 KB |
| `text/template` | ~125 KB |
| `fmt` (via `layout.go`) | ~50 KB |
| `go-i18n/reversal` (via `pipeline.go`) | ~250 KB |
**Total bloat**: ~625 KB gzip over the core rendering requirement.
The fix was applied in three distinct steps:
### Step 1: Remove registerComponents from WASM (`4c65737`)
`cmd/wasm/register.go` received a `//go:build !js` build tag, completely excluding it from the WASM compilation unit. The `registerComponents` entry on the `gohtml` JS object was removed from `cmd/wasm/main.go`. The codegen function was moved to a standalone build-time CLI at `cmd/codegen/main.go`. This eliminated `encoding/json` and `text/template` from the WASM import graph.
### Step 2: Remove pipeline from WASM
`pipeline.go` received a `//go:build !js` build tag. The `Imprint()` and `CompareVariants()` functions depend on `go-i18n/reversal`, which is a heavyweight analysis library. These functions are server-side analysis tools and have no use in a client-side rendering module. The `renderToString` function in the WASM entry point never called them, so removal was non-breaking.
### Step 3: Eliminate fmt from WASM
`layout.go`'s `blockID()` method had used `fmt.Sprintf` for string construction. Replacing this with direct string concatenation (`l.path + string(slot) + "-0"`) removed `fmt` from the WASM import graph entirely.
**Result**: 2.90 MB raw, 842 KB gzip. 47% reduction in gzip size. Well within the 1 MB limit.
### Size gate test (`aae5d21`)
`cmd/wasm/size_test.go` was added to prevent regression. `TestWASMBinarySize_Good` builds the WASM binary in a temp directory, gzip-compresses it, and asserts:
- Gzip size < 1,048,576 bytes (1 MB).
- Raw size < 3,145,728 bytes (3 MB).
The test is skipped under `go test -short` and is guarded with `//go:build !js`.
## Test Coverage Milestones
- `7efd2ab` — Benchmarks added across all subsystems. Unicode edge case tests. Stress tests.
- `ab7ab92` — 53 passing tests across the package and sub-packages.
- `aae5d21` — 70+ tests passing (server-side); WASM size gate and codegen CLI tests added.
## Known Limitations (as of current HEAD)
These are not regressions; they are design choices or deferred work recorded for future consideration.
1. **Invalid layout variants are silent.** `NewLayout("XYZ")` produces empty output. No error, no warning. Adding validation would require changing the return type of `NewLayout` from `*Layout` to `(*Layout, error)`, which is a breaking API change.
2. **No WASM integration test.** `cmd/wasm/size_test.go` tests binary size only. The `renderToString` behaviour is tested by building and running the WASM binary in a browser, not by an automated test. A `syscall/js`-compatible test harness would be needed.
3. **Responsive accepts only Layout.** `Responsive.Variant()` takes `*Layout` rather than `Node`. The rationale is that `CompareVariants` in the pipeline needs access to the slot structure. Accepting `Node` would require a different approach to variant analysis.
4. **Context.service is private.** The i18n service cannot be set after construction or swapped. This is a conservative choice; relaxing it requires deciding whether mutation should be safe for concurrent use.
5. **TypeScript definitions not generated.** `codegen.GenerateBundle()` produces JS only. A `.d.ts` companion would benefit TypeScript consumers of the generated Web Components.
6. **No CSS scoping helper.** Responsive variants are identified by `data-variant` attributes. Targeting them from CSS requires knowledge of the attribute name. An optional utility for generating scoped CSS selectors is deferred.
7. **Browser polyfill matrix not documented.** Closed Shadow DOM is well-supported but older browsers require polyfills. The support matrix is not documented.
## Future Considerations
These items were captured during the WASM size reduction work and expert review sessions. They are not committed work items.
- **TypeScript type definitions** alongside `GenerateBundle()` for typed Web Component consumers.
- **Accessibility helpers** — `aria-label` builder, `alt` text helpers, focus management nodes. The layout has semantic HTML and ARIA roles but no API for fine-grained accessibility attributes beyond `Attr()`.
- **Layout variant validation** — return a warning or sentinel error from `NewLayout` when the variant string contains unrecognised slot characters.
- **Daemon mode for codegen** — watch mode for regenerating the JS bundle when slot config changes, for development workflows.