docs: add human-friendly documentation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Snider 2026-03-11 13:02:40 +00:00
parent 473fda5894
commit 5bbacc54fe
3 changed files with 426 additions and 226 deletions


@ -1,12 +1,15 @@
---
title: Architecture
description: Internals of the go-html HLCRF DOM compositor, covering the node interface, layout system, responsive wrapper, grammar pipeline, WASM module, and codegen CLI.
---
# Architecture
`go-html` is an HLCRF DOM compositor with grammar pipeline integration. It provides a pure-Go, type-safe HTML rendering library designed for server-side generation with an optional lightweight WASM client module.
Module path: `forge.lthn.ai/core/go-html`
`go-html` is structured around a single interface, a layout compositor, and a server-side analysis pipeline. Everything renders to `string` -- there is no virtual DOM, no diffing, and no retained state between renders.
## Node Interface
All renderable units implement a single interface:
Every renderable unit implements one method:
```go
type Node interface {
@ -14,204 +17,282 @@ type Node interface {
}
```
Every node type is a private struct with a public constructor. The API surface is intentionally small: nine public constructors plus `Attr()` and `Render()` helpers.
All concrete node types are unexported structs with exported constructor functions. The public API surface consists of nine constructors plus the `Attr()` and `Render()` helpers:
| Constructor | Description |
|-------------|-------------|
| `El(tag, ...Node)` | HTML element with children |
| `Attr(Node, key, value)` | Set attribute on an El node; chainable |
| `Text(key, ...any)` | Translated, HTML-escaped text via go-i18n |
| `Raw(content)` | Unescaped trusted content |
| `If(cond, Node)` | Conditional render |
| `Unless(cond, Node)` | Inverse conditional render |
| `Each[T](items, fn)` | Type-safe iteration with generics |
| `Switch(selector, cases)` | Runtime dispatch to named cases |
| `Entitled(feature, Node)` | Entitlement-gated render; deny-by-default |
| Constructor | Behaviour |
|-------------|-----------|
| `El(tag, ...Node)` | HTML element with children. Void elements (`br`, `img`, `input`, etc.) never emit a closing tag. |
| `Attr(Node, key, value)` | Sets an attribute on an `El` node. Traverses through `If`, `Unless`, and `Entitled` wrappers. Returns the node for chaining. |
| `Text(key, ...any)` | Translated text via `go-i18n`. Output is always HTML-escaped. |
| `Raw(content)` | Unescaped trusted content. Explicit escape hatch. |
| `If(cond, Node)` | Renders the child only when the condition function returns true. |
| `Unless(cond, Node)` | Renders the child only when the condition function returns false. |
| `Each[T](items, fn)` | Iterates a slice and renders each item via a mapping function. Generic over `T`. |
| `EachSeq[T](items, fn)` | Same as `Each` but accepts an `iter.Seq[T]` instead of a slice. |
| `Switch(selector, cases)` | Renders one of several named cases based on a runtime selector function. Returns empty string when no case matches. |
| `Entitled(feature, Node)` | Renders the child only when the context's entitlement function grants the named feature. Deny-by-default: returns empty string when no entitlement function is set. |
### Safety guarantees
### Safety Guarantees
- `Text` nodes are always HTML-escaped. XSS via user-supplied strings fed through `Text()` is not possible.
- `Raw` is an explicit escape hatch for trusted content only. Its name signals intent.
- `Entitled` returns an empty string when no entitlement function is set on the context. Access is denied by default, not granted.
- `El` attributes are sorted alphabetically before output, producing deterministic HTML regardless of insertion order.
- Void elements (`br`, `img`, `input`, etc.) never emit a closing tag.
## HLCRF Layout
The `Layout` type is a compositor for five named slots: **H**eader, **L**eft, **C**ontent, **R**ight, **F**ooter. Each slot maps to a specific semantic HTML element and ARIA role:
| Slot | Element | ARIA role |
|------|---------|-----------|
| H | `<header>` | `banner` |
| L | `<aside>` | `complementary` |
| C | `<main>` | `main` |
| R | `<aside>` | `complementary` |
| F | `<footer>` | `contentinfo` |
A layout variant string selects which slots are rendered and in which order:
```go
NewLayout("HLCRF") // all five slots
NewLayout("HCF") // header, content, footer — no sidebars
NewLayout("C") // content only
```
Each rendered slot receives a deterministic `data-block` attribute encoding its position in the tree. The root layout produces IDs in the form `{slot}-0` (e.g., `H-0`, `C-0`). Nested layouts extend the parent's block ID as a path prefix: a `Layout` placed inside the `L` slot of a root layout will produce inner slot IDs like `L-0-H-0`, `L-0-C-0`.
This path scheme is computed without `fmt.Sprintf` — using simple string concatenation — to keep `fmt` out of the WASM import graph.
### Nested layouts
`Layout` implements `Node`, so it can be placed inside any slot of another layout. At render time, nested layouts are cloned and their `path` field is set to the parent's block ID. This clone-on-render approach avoids shared mutation and is safe for concurrent use.
```go
inner := NewLayout("HCF").H(Raw("nav")).C(Raw("body")).F(Raw("links"))
outer := NewLayout("HLCRF").H(Raw("top")).L(inner).C(Raw("main")).F(Raw("foot"))
```
### Fluent builder
All slot methods return the `*Layout` for chaining. Multiple nodes may be appended to the same slot across multiple calls:
```go
NewLayout("HCF").
H(El("h1", Text("Title"))).
C(El("p", Text("Content")), Raw("<hr>")).
F(El("small", Text("Copyright")))
```
## Responsive Compositor
`Responsive` wraps multiple named `Layout` variants for breakpoint-aware rendering. Each variant renders inside a `<div data-variant="name">` container, giving CSS media queries or JavaScript a stable hook for show/hide logic.
```go
NewResponsive().
Variant("desktop", NewLayout("HLCRF")...).
Variant("tablet", NewLayout("HCF")...).
Variant("mobile", NewLayout("C")...)
```
`Responsive` itself implements `Node` and may be passed to `Imprint()` for cross-variant semantic analysis.
Note: `Responsive.Variant()` accepts only `*Layout`, not arbitrary `Node` values. Arbitrary subtrees must be wrapped in a layout first.
- **XSS prevention**: `Text()` nodes always HTML-escape their output via `html.EscapeString()`. User-supplied strings passed through `Text()` cannot inject HTML.
- **Attribute escaping**: Attribute values are escaped with `html.EscapeString()`, handling `&`, `<`, `>`, `"`, and `'`.
- **Deterministic output**: Attribute keys on `El` nodes are sorted alphabetically before rendering, producing identical output regardless of insertion order.
- **Void elements**: A lookup table of 13 void elements (`area`, `base`, `br`, `col`, `embed`, `hr`, `img`, `input`, `link`, `meta`, `source`, `track`, `wbr`) ensures these never emit a closing tag.
- **Deny-by-default entitlements**: `Entitled` returns an empty string when the context is nil, when no entitlement function is set, or when the function returns false. Content is absent from the DOM, not merely hidden.
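The attribute and void-element guarantees above can be illustrated with a minimal standalone sketch. `renderElement` is a hypothetical helper written for this example, not the library's implementation; it only assumes the behaviour described in the list (sorted keys, `html.EscapeString()` on values, no closing tag for void elements).

```go
package main

import (
	"fmt"
	"html"
	"sort"
	"strings"
)

// voidElements mirrors the 13 void elements listed above.
var voidElements = map[string]bool{
	"area": true, "base": true, "br": true, "col": true, "embed": true,
	"hr": true, "img": true, "input": true, "link": true, "meta": true,
	"source": true, "track": true, "wbr": true,
}

// renderElement is a hypothetical helper (not part of go-html).
// It sorts attribute keys, escapes attribute values, and omits the
// closing tag for void elements.
func renderElement(tag string, attrs map[string]string, inner string) string {
	keys := make([]string, 0, len(attrs))
	for k := range attrs {
		keys = append(keys, k)
	}
	sort.Strings(keys) // map order is random; sorting makes output stable

	var b strings.Builder
	b.WriteString("<" + tag)
	for _, k := range keys {
		b.WriteString(" " + k + `="` + html.EscapeString(attrs[k]) + `"`)
	}
	b.WriteString(">")
	if voidElements[tag] {
		return b.String() // void elements never get a closing tag
	}
	b.WriteString(inner + "</" + tag + ">")
	return b.String()
}

func main() {
	fmt.Println(renderElement("img", map[string]string{"src": "a.png", "alt": `"Tom" & Jerry`}, ""))
	fmt.Println(renderElement("p", map[string]string{"id": "x", "class": "lead"}, "hi"))
}
```

Because the keys are sorted before writing, two maps with the same entries always produce byte-identical markup, which is what makes snapshot-style tests practical.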
## Rendering Context
`Context` carries per-request state through the entire node tree:
The `Context` struct carries per-request state through the node tree during rendering:
```go
type Context struct {
Identity string
Locale string
Entitlements func(feature string) bool
Data map[string]any
service *i18n.Service // private; set via NewContextWithService()
Identity string // e.g. user ID or session identifier
Locale string // BCP 47 locale string
Entitlements func(feature string) bool // feature gate callback
Data map[string]any // arbitrary per-request data
service *i18n.Service // unexported; set via constructor
}
```
The `service` field is intentionally unexported. Custom i18n adapter injection requires `NewContextWithService(svc)`. This prevents callers from setting it inconsistently after construction.
Two constructors are provided:
When `ctx.service` is nil, `Text` nodes fall back to the global `i18n.T()` default service.
- `NewContext()` creates a context with sensible defaults and an empty `Data` map.
- `NewContextWithService(svc)` creates a context backed by a specific `i18n.Service` instance.
## Grammar Pipeline
The `service` field is intentionally unexported, which prevents callers from setting the service inconsistently after construction. When it is nil, `Text` nodes fall back to the global `i18n.T()` default.
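As a rough illustration of how the `Entitlements` callback yields deny-by-default behaviour, the sketch below uses a trimmed-down stand-in for `Context`. The helpers `entitled` and `renderGated` are hypothetical names introduced for this example only.

```go
package main

import "fmt"

// Context is a reduced stand-in for the struct shown above; only the
// field needed for gating is included.
type Context struct {
	Entitlements func(feature string) bool
}

// entitled is a hypothetical helper. It denies access when the context
// is nil, when no callback is set, or when the callback returns false.
func entitled(ctx *Context, feature string) bool {
	if ctx == nil || ctx.Entitlements == nil {
		return false
	}
	return ctx.Entitlements(feature)
}

// renderGated returns content only when the feature is granted;
// otherwise the content is absent from the output entirely.
func renderGated(ctx *Context, feature, content string) string {
	if !entitled(ctx, feature) {
		return ""
	}
	return content
}

func main() {
	premium := &Context{Entitlements: func(f string) bool { return f == "premium" }}
	fmt.Printf("%q\n", renderGated(nil, "premium", "<p>gold</p>"))
	fmt.Printf("%q\n", renderGated(&Context{}, "premium", "<p>gold</p>"))
	fmt.Printf("%q\n", renderGated(premium, "premium", "<p>gold</p>"))
}
```

Only the last call produces output; the nil context and the context without a callback both render nothing.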
The grammar pipeline is a server-side-only feature. It is guarded with `//go:build !js` and absent from all WASM builds.
## HLCRF Layout
The `Layout` type is a compositor for five named slots:
| Slot Letter | Semantic Element | ARIA Role | Accessor |
|-------------|-----------------|-----------|----------|
| H | `<header>` | `banner` | `layout.H(...)` |
| L | `<aside>` | `complementary` | `layout.L(...)` |
| C | `<main>` | `main` | `layout.C(...)` |
| R | `<aside>` | `complementary` | `layout.R(...)` |
| F | `<footer>` | `contentinfo` | `layout.F(...)` |
### Variant String
The variant string passed to `NewLayout()` determines which slots render and in which order:
```go
NewLayout("HLCRF") // all five slots
NewLayout("HCF") // header, content, footer (no sidebars)
NewLayout("C") // content only
NewLayout("LC") // left sidebar and content
```
Slot letters not present in the variant string are ignored, even if nodes have been appended to those slots. Unrecognised characters (lowercase, digits, special characters) are silently skipped -- no error is returned.
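A minimal sketch of this filtering, assuming the variant string is scanned left to right and only the five uppercase slot letters are kept. `slotsFor` is a hypothetical name; how the real parser treats repeated letters is not covered here.

```go
package main

import "os"

// slotsFor is a hypothetical helper: it keeps recognised slot letters in
// the order they appear and silently drops everything else.
func slotsFor(variant string) []byte {
	var slots []byte
	for i := 0; i < len(variant); i++ {
		switch c := variant[i]; c {
		case 'H', 'L', 'C', 'R', 'F':
			slots = append(slots, c)
		}
	}
	return slots
}

func main() {
	for _, v := range []string{"HLCRF", "HCF", "h1C!", ""} {
		os.Stdout.WriteString(`"` + v + `" -> "` + string(slotsFor(v)) + "\"\n")
	}
}
```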
### Deterministic Block IDs
Each rendered slot receives a `data-block` attribute encoding its position in the layout tree. At the root level, IDs follow the pattern `{slot}-0`:
```html
<header role="banner" data-block="H-0">...</header>
<main role="main" data-block="C-0">...</main>
<footer role="contentinfo" data-block="F-0">...</footer>
```
Block IDs are constructed by simple string concatenation (no `fmt.Sprintf`) to keep the `fmt` package out of the WASM import graph.
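The concatenation approach might look like the sketch below. `blockID` is a hypothetical helper, and the `parent` argument (empty at the root) is an assumption about how a path prefix would be threaded through; the example deliberately avoids importing `fmt`.

```go
package main

import "os"

// blockID is a hypothetical helper that builds a data-block value by
// plain concatenation. An empty parent means the slot is at the root.
func blockID(parent string, slot byte) string {
	id := string(slot) + "-0"
	if parent == "" {
		return id
	}
	return parent + "-" + id
}

func main() {
	root := blockID("", 'L')
	nested := blockID(root, 'C')
	os.Stdout.WriteString(root + "\n")
	os.Stdout.WriteString(nested + "\n")
}
```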
### Nested Layouts
`Layout` implements `Node`, so a layout can be placed inside any slot of another layout. At render time, nested layouts are cloned and their internal `path` field is set to the parent's block ID as a prefix. This produces hierarchical paths:
```go
inner := html.NewLayout("HCF").
H(html.Raw("nav")).
C(html.Raw("body")).
F(html.Raw("links"))
outer := html.NewLayout("HLCRF").
H(html.Raw("top")).
L(inner). // inner layout nested in the Left slot
C(html.Raw("main")).
F(html.Raw("foot"))
```
The inner layout's slots render with prefixed block IDs: `L-0-H-0`, `L-0-C-0`, `L-0-F-0`. At 10 levels of nesting, the deepest block ID becomes `C-0-C-0-C-0-C-0-C-0-C-0-C-0-C-0-C-0-C-0` (tested in `edge_test.go`).
The clone-on-render approach means the original layout is never mutated. This is safe for concurrent use.
### Fluent Builder
All slot methods return `*Layout` for chaining. Multiple nodes can be appended to the same slot across multiple calls:
```go
html.NewLayout("HCF").
H(html.El("h1", html.Text("page.title"))).
C(html.El("p", html.Text("intro"))).
C(html.El("p", html.Text("body"))). // appends to the same C slot
F(html.El("small", html.Text("footer")))
```
### Block ID Parsing
`ParseBlockID()` in `path.go` extracts the slot letter sequence from a `data-block` attribute value:
```go
ParseBlockID("L-0-C-0") // returns ['L', 'C']
ParseBlockID("C-0-C-0-C-0") // returns ['C', 'C', 'C']
ParseBlockID("H-0") // returns ['H']
ParseBlockID("") // returns nil
```
This enables server-side or client-side code to locate a specific block in the rendered tree by its structural path.
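For readers who want to see the decoding logic, here is a standalone sketch that reproduces the results listed above. `parseBlockID` is a hypothetical re-implementation, not the code in `path.go`; returning nil for malformed input is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// parseBlockID is a hypothetical re-implementation for illustration.
// It splits on "-" and keeps every other part (the slot letters),
// skipping the numeric index that follows each letter.
func parseBlockID(id string) []byte {
	if id == "" {
		return nil
	}
	parts := strings.Split(id, "-")
	var slots []byte
	for i := 0; i < len(parts); i += 2 {
		if len(parts[i]) != 1 {
			return nil // malformed input; this behaviour is an assumption
		}
		slots = append(slots, parts[i][0])
	}
	return slots
}

func main() {
	fmt.Printf("%q\n", parseBlockID("L-0-C-0"))
	fmt.Printf("%q\n", parseBlockID("H-0"))
	fmt.Println(parseBlockID("") == nil)
}
```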
## Responsive Compositor
`Responsive` wraps multiple named `Layout` variants for breakpoint-aware rendering:
```go
html.NewResponsive().
Variant("desktop", html.NewLayout("HLCRF").
H(html.Raw("header")).L(html.Raw("nav")).C(html.Raw("main")).
R(html.Raw("aside")).F(html.Raw("footer"))).
Variant("tablet", html.NewLayout("HCF").
H(html.Raw("header")).C(html.Raw("main")).F(html.Raw("footer"))).
Variant("mobile", html.NewLayout("C").
C(html.Raw("main")))
```
Each variant renders inside a `<div data-variant="name">` container. Variants render in insertion order. CSS media queries or JavaScript can target these containers for show/hide logic.
`Responsive` implements `Node`, so it can be passed to `Render()` or `Imprint()`. The `Variant()` method accepts `*Layout` specifically, not arbitrary `Node` values.
Each variant maintains an independent block ID namespace, so nesting a layout inside one responsive variant does not conflict with the same layout structure in another variant.
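The wrapper markup described above can be sketched as follows. The `variant` type and `renderVariants` function are hypothetical, and escaping the variant name is an assumption rather than documented behaviour; the bodies stand in for already-rendered layouts.

```go
package main

import (
	"html"
	"os"
	"strings"
)

// variant pairs a name with already-rendered, trusted layout HTML.
// Both the type and its fields are hypothetical.
type variant struct {
	name string
	body string
}

// renderVariants wraps each body in a data-variant container. A slice
// is used so that insertion order is preserved.
func renderVariants(variants []variant) string {
	var b strings.Builder
	for _, v := range variants {
		b.WriteString(`<div data-variant="` + html.EscapeString(v.name) + `">`)
		b.WriteString(v.body)
		b.WriteString("</div>")
	}
	return b.String()
}

func main() {
	out := renderVariants([]variant{
		{name: "desktop", body: "<main>A</main>"},
		{name: "mobile", body: "<main>B</main>"},
	})
	os.Stdout.WriteString(out + "\n")
}
```

A slice rather than a map keeps the output order tied to the order of `Variant()` calls.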
## Grammar Pipeline (Server-Side Only)
The grammar pipeline is excluded from WASM builds via `//go:build !js` on `pipeline.go`. It bridges the rendering layer to the semantic analysis layer.
### StripTags
`StripTags(html string) string` converts rendered HTML to plain text. Tag boundaries are collapsed to single spaces; the result is trimmed. The implementation is a single-pass rune scanner: no regex, no allocations beyond the output builder. It does not attempt to elide `<script>` or `<style>` content because `go-html` never generates those elements.
```go
func StripTags(html string) string
```
Converts rendered HTML to plain text. Tag boundaries are collapsed into single spaces; the result is trimmed. The implementation is a single-pass rune scanner with no regular expressions and no allocations beyond the output `strings.Builder`. It does not handle `<script>` or `<style>` content because `go-html` never generates those elements.
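A single-pass scanner of this kind could be written as in the sketch below. `stripTags` here is a hypothetical stand-in and may differ from the real function in edge cases such as unclosed tags or whitespace handling inside text.

```go
package main

import (
	"fmt"
	"strings"
)

// stripTags is a hypothetical single-pass scanner. Text outside tags is
// copied verbatim; each closing '>' emits at most one space so adjacent
// tag boundaries collapse; the result is trimmed.
func stripTags(s string) string {
	var b strings.Builder
	inTag := false
	lastSpace := true // true at the start so no leading space is written
	for _, r := range s {
		switch {
		case r == '<':
			inTag = true
		case r == '>' && inTag:
			inTag = false
			if !lastSpace {
				b.WriteByte(' ')
				lastSpace = true
			}
		case inTag:
			// Skip everything inside a tag.
		default:
			b.WriteRune(r)
			lastSpace = r == ' '
		}
	}
	return strings.TrimSpace(b.String())
}

func main() {
	fmt.Printf("%q\n", stripTags("<div><p>a</p><p>b</p></div>"))
	fmt.Printf("%q\n", stripTags("<p>hello</p>"))
}
```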
### Imprint
`Imprint(node Node, ctx *Context) reversal.GrammarImprint` runs the full render-to-analysis pipeline:
```go
func Imprint(node Node, ctx *Context) reversal.GrammarImprint
```
1. Call `node.Render(ctx)` to produce HTML.
2. Pass HTML through `StripTags` to extract plain text.
3. Pass plain text through `go-i18n/reversal.Tokeniser` to produce a token sequence.
4. Wrap tokens in a `reversal.GrammarImprint` for structural analysis.
Runs the full render-to-analysis pipeline:
The resulting `GrammarImprint` exposes `TokenCount`, `UniqueVerbs`, and a `Similar()` method for pairwise semantic similarity scoring. This bridges the rendering layer to the privacy and analytics layers of the Lethean stack.
1. Renders the node tree to HTML via `node.Render(ctx)`.
2. Strips HTML tags via `StripTags()` to extract plain text.
3. Tokenises the text via `go-i18n/reversal.NewTokeniser().Tokenise()`.
4. Wraps tokens in a `reversal.GrammarImprint` for structural analysis.
The resulting `GrammarImprint` exposes `TokenCount`, `UniqueVerbs`, and a `Similar()` method for pairwise semantic similarity scoring.
A nil context is handled gracefully: `Imprint` creates a default context internally.
### CompareVariants
`CompareVariants(r *Responsive, ctx *Context) map[string]float64` runs `Imprint` on each named layout variant in a `Responsive` and returns pairwise similarity scores. Keys are `"name1:name2"`. This enables detection of semantically divergent responsive variants — for example, a mobile layout that strips critical information that appears in the desktop variant.
```go
func CompareVariants(r *Responsive, ctx *Context) map[string]float64
```
## Server/Client Split
Runs `Imprint` independently on each named layout variant in a `Responsive` and returns pairwise similarity scores. Keys are formatted as `"name1:name2"`.
The binary split is enforced by Go build tags.
This enables detection of semantically divergent responsive variants -- for example, a mobile layout that strips critical information present in the desktop variant. Same-content variants with different layout structures (e.g. `HLCRF` vs `HCF`) score above 0.8 similarity.
| File | Build tag | Reason for exclusion from WASM |
|------|-----------|-------------------------------|
| `pipeline.go` | `//go:build !js` | Imports `go-i18n/reversal` (~250 KB gzip) |
| `cmd/wasm/register.go` | `//go:build !js` | Imports `encoding/json` (~200 KB gzip) and `text/template` (~125 KB gzip) |
The WASM binary includes only: node types, layout, responsive, context, render, path, and go-i18n core (translation). No codegen, no pipeline, no JSON, no templates, no `fmt`.
A single-variant `Responsive` produces an empty score map (no pairs to compare).
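The shape of the returned map can be illustrated without the grammar pipeline. In the sketch below, `pairwiseScores` is a hypothetical helper and the `similar` callback is a placeholder for a real imprint comparison; emitting one key per unordered pair is an assumption.

```go
package main

import "fmt"

// pairwiseScores is a hypothetical helper. It visits each unordered pair
// of names once and stores the score under the key "name1:name2".
func pairwiseScores(names []string, similar func(a, b string) float64) map[string]float64 {
	scores := make(map[string]float64)
	for i := 0; i < len(names); i++ {
		for j := i + 1; j < len(names); j++ {
			scores[names[i]+":"+names[j]] = similar(names[i], names[j])
		}
	}
	return scores
}

func main() {
	// Placeholder scorer; a real one would compare GrammarImprint values.
	constant := func(a, b string) float64 { return 1.0 }

	fmt.Println(len(pairwiseScores([]string{"desktop"}, constant)))
	fmt.Println(len(pairwiseScores([]string{"desktop", "tablet", "mobile"}, constant)))
}
```

With a single name the inner loop never runs, which matches the empty score map described above.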
## WASM Module
The WASM entry point is `cmd/wasm/main.go`, compiled with `GOOS=js GOARCH=wasm`.
It exposes a single JavaScript function on `window.gohtml`:
The WASM entry point at `cmd/wasm/main.go` is compiled with `GOOS=js GOARCH=wasm` and exposes a single JavaScript function:
```js
gohtml.renderToString(variant, locale, slots)
```
- `variant`: HLCRF variant string, e.g. `"HCF"`.
- `locale`: BCP 47 locale string for i18n, e.g. `"en-GB"`.
- `slots`: object with optional keys `H`, `L`, `C`, `R`, `F` containing HTML strings.
**Parameters:**
Slot content is injected via `Raw()`. The caller is responsible for sanitisation. This is intentional: the WASM module is a rendering engine for trusted content produced server-side or by the application's own templates.
- `variant` (string): HLCRF variant string, e.g. `"HCF"`.
- `locale` (string): BCP 47 locale string for i18n, e.g. `"en-GB"`.
- `slots` (object): Optional keys `H`, `L`, `C`, `R`, `F` containing HTML strings.
### Size gate
Slot content is injected via `Raw()`. The caller is responsible for sanitisation -- the WASM module is a rendering engine for trusted content produced server-side or by the application's own templates.
`cmd/wasm/size_test.go` contains `TestWASMBinarySize_Good`, a build-gated test that:
### Size Budget
1. Builds the WASM binary with `-ldflags=-s -w`.
2. Gzip-compresses the output at best compression.
3. Asserts the compressed size is below 1,048,576 bytes (1 MB).
4. Asserts the raw size is below 3,145,728 bytes (3 MB).
The WASM binary has a size gate enforced by `cmd/wasm/size_test.go`:
This test is skipped under `go test -short`. It is guarded with `//go:build !js` so it does not run within the WASM environment itself. Current measured size: 2.90 MB raw, 842 KB gzip.
| Metric | Limit | Current |
|--------|-------|---------|
| Raw binary | 3 MB | ~2.90 MB |
| Gzip compressed | 1 MB | ~842 KB |
The test builds the WASM binary as a subprocess and is skipped under `go test -short`. The Makefile `wasm` target performs the same build with size checking.
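The gzip half of such a size gate can be sketched with the standard library alone. `gzipSize` and `withinBudget` are hypothetical helpers, and the payload in `main` is a stand-in for the built binary; the actual test in `cmd/wasm/size_test.go` may be structured differently.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
)

// gzipLimit is the 1 MB compressed budget from the table above.
const gzipLimit = 1 << 20

// gzipSize returns the size of data after gzip at best compression.
func gzipSize(data []byte) (int, error) {
	var buf bytes.Buffer
	zw, err := gzip.NewWriterLevel(&buf, gzip.BestCompression)
	if err != nil {
		return 0, err
	}
	if _, err := zw.Write(data); err != nil {
		return 0, err
	}
	if err := zw.Close(); err != nil { // Close flushes the final block
		return 0, err
	}
	return buf.Len(), nil
}

// withinBudget reports whether the compressed size is under the limit.
func withinBudget(data []byte) (bool, error) {
	n, err := gzipSize(data)
	if err != nil {
		return false, err
	}
	return n < gzipLimit, nil
}

func main() {
	payload := bytes.Repeat([]byte("go-html "), 1024) // stand-in for the WASM binary
	n, err := gzipSize(payload)
	if err != nil {
		fmt.Println("gzip failed:", err)
		return
	}
	fmt.Println("raw:", len(payload), "gzip:", n, "ok:", n < gzipLimit)
}
```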
### Server/Client Split
The binary split is enforced by Go build tags:
| File | Build Tag | Reason for WASM Exclusion |
|------|-----------|--------------------------|
| `pipeline.go` | `!js` | Imports `go-i18n/reversal` |
| `cmd/wasm/register.go` | `!js` | Imports `encoding/json` and `text/template` |
The WASM binary includes only: node types, layout, responsive, context, render, path, and `go-i18n` core translation. No codegen, no pipeline, no JSON, no templates, no `fmt`.
## Codegen CLI
`cmd/codegen/main.go` is a build-time tool for generating Web Component JavaScript bundles from HLCRF slot assignments. It reads a JSON slot map from stdin and writes the generated JS to stdout.
`cmd/codegen/main.go` generates Web Component JavaScript bundles from HLCRF slot assignments at build time:
```bash
echo '{"H":"nav-bar","C":"main-content"}' | go run ./cmd/codegen/ > components.js
echo '{"H":"nav-bar","C":"main-content","F":"page-footer"}' | go run ./cmd/codegen/ > components.js
```
The `codegen` package generates ES2022 class definitions with closed Shadow DOM. The generated pattern per component:
The `codegen` package (`codegen/codegen.go`) generates ES2022 class definitions with closed Shadow DOM. For each custom element tag, it produces:
- A class extending `HTMLElement` with a private `#shadow` field.
- `constructor()` attaches a closed shadow root (`mode: "closed"`).
- `connectedCallback()` dispatches a `wc-ready` custom event with the tag name and slot.
- `render(html)` sets shadow content from a `<template>` clone.
- `customElements.define()` registration.
1. A class extending `HTMLElement` with a private `#shadow` field.
2. `constructor()` attaching a closed shadow root (`mode: "closed"`).
3. `connectedCallback()` dispatching a `wc-ready` custom event with the tag name and slot.
4. A `render(html)` method that sets shadow content from a `<template>` clone.
5. A `customElements.define()` registration call.
Closed Shadow DOM provides style isolation. Content is set via the DOM API, never via `innerHTML` directly on the element.
Tag names must contain a hyphen (Web Components specification requirement). `TagToClassName()` converts kebab-case to PascalCase: `nav-bar` becomes `NavBar`, `my-super-widget` becomes `MySuperWidget`.
Tag names must contain a hyphen (Web Components specification requirement). `TagToClassName()` converts kebab-case tags to PascalCase class names: `nav-bar` becomes `NavBar`.
`GenerateBundle()` deduplicates tags -- if the same tag is assigned to multiple slots, only one class definition is emitted.
The codegen CLI uses `encoding/json` and `text/template`, which are excluded from the WASM build. Consumers generate the JS bundle at build time, not at runtime.
The codegen CLI uses `encoding/json` and `text/template`, which are excluded from the WASM build. Consumers generate the JS bundle at build time and serve it as a static asset.
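The kebab-case to PascalCase conversion and the hyphen requirement described in this section can be sketched as below. `tagToClassName` and `isValidTag` are hypothetical stand-ins for the `codegen` functions, and the sketch assumes ASCII tag names.

```go
package main

import (
	"fmt"
	"strings"
)

// isValidTag checks the hyphen requirement for custom element names.
// Real validation may be stricter; this is only the rule stated above.
func isValidTag(tag string) bool {
	return strings.Contains(tag, "-")
}

// tagToClassName is a hypothetical stand-in: it splits on hyphens and
// capitalises the first letter of each part (ASCII tags assumed).
func tagToClassName(tag string) string {
	var b strings.Builder
	for _, part := range strings.Split(tag, "-") {
		if part == "" {
			continue // tolerate doubled or trailing hyphens
		}
		b.WriteString(strings.ToUpper(part[:1]))
		b.WriteString(part[1:])
	}
	return b.String()
}

func main() {
	for _, tag := range []string{"nav-bar", "my-super-widget", "header"} {
		fmt.Println(tag, isValidTag(tag), tagToClassName(tag))
	}
}
```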
## Block ID Path Scheme
`path.go` exports `ParseBlockID(id string) []byte`, which extracts the slot letter sequence from a `data-block` attribute value.
Format: slots are separated by `-0-`. The sequence `L-0-C-0` decodes to `['L', 'C']`, meaning the content slot of a layout nested inside the left slot.
This scheme is deterministic and human-readable. It enables server-side or client-side code to locate a specific block in the rendered tree by path.
## Dependency Graph
## Data Flow Summary
```
go-html
├── forge.lthn.ai/core/go-i18n (direct, all builds)
│ └── forge.lthn.ai/core/go-inference (indirect)
├── forge.lthn.ai/core/go-i18n/reversal (server builds only, !js)
└── github.com/stretchr/testify (test only)
```
Server-Side
+-------------------+
| |
Node tree -------> Render(ctx) |-----> HTML string
| |
| StripTags() |-----> plain text
| |
| Imprint() |-----> GrammarImprint
| | .TokenCount
| CompareVariants()| .UniqueVerbs
| | .Similar()
+-------------------+
Both `go-i18n` and `go-html` are developed in parallel. The `go.mod` uses a `replace` directive pointing to `../go-i18n`. Both repositories must be present on the local filesystem for builds and tests.
WASM Client
+-------------------+
| |
JS call ---------> renderToString() |-----> HTML string
(variant, locale, | |
slots object) +-------------------+
Build Time
+-------------------+
| |
JSON slot map ---> cmd/codegen/ |-----> Web Component JS
(stdin) | | (stdout)
+-------------------+
```


@ -1,36 +1,48 @@
---
title: Development Guide
description: How to build, test, and contribute to go-html, including WASM builds, benchmarks, coding standards, and test patterns.
---
# Development Guide
## Prerequisites
- Go 1.25 or later (Go workspace required).
- `go-i18n` repository cloned alongside this one: `../go-i18n` relative to the repository root. The `go.mod` `replace` directive points there.
- `go-inference` also resolved via `replace` directive at `../go-inference`. It is an indirect dependency pulled in by `go-i18n`.
- `testify` is the only external test dependency; it is fetched by the Go module system.
- **Go 1.26** or later. The module also relies on features from earlier releases, e.g. `range` over integers (Go 1.22) and `iter.Seq` (Go 1.23).
- **go-i18n** cloned alongside this repository at `../go-i18n` relative to the repo root. The `go.mod` `replace` directive points there.
- **go-inference** also resolved via `replace` directive at `../go-inference`. It is an indirect dependency pulled in by `go-i18n`.
- **Go workspace** (`go.work`): this module is part of a shared workspace. Run `go work sync` after cloning.
No additional tools are required for server-side development. WASM builds require a standard Go installation with `GOOS=js GOARCH=wasm` cross-compilation support, which is included in all official Go distributions.
No additional tools are required for server-side development. WASM builds require the standard Go cross-compilation support (`GOOS=js GOARCH=wasm`), included in all official Go distributions.
## Directory Layout
```
go-html/
├── node.go Node interface and all node types
├── layout.go HLCRF compositor
├── pipeline.go StripTags, Imprint, CompareVariants (!js only)
├── responsive.go Multi-variant breakpoint wrapper
├── context.go Rendering context
├── render.go Render() convenience function
├── path.go ParseBlockID() for data-block path decoding
├── codegen/
│ └── codegen.go Web Component JS generation (server-side)
├── cmd/
│ ├── codegen/
│ │ └── main.go Build-time CLI (stdin JSON → stdout JS)
│ └── wasm/
│ ├── main.go WASM entry point (js+wasm build only)
│ ├── register.go buildComponentJS helper (!js only)
│ └── size_test.go WASM binary size gate test (!js only)
└── docs/
└── plans/ Phase design documents (historical)
node.go Node interface and all node types
layout.go HLCRF compositor
pipeline.go StripTags, Imprint, CompareVariants (!js only)
responsive.go Multi-variant breakpoint wrapper
context.go Rendering context
render.go Render() convenience function
path.go ParseBlockID() for data-block path decoding
codegen/
codegen.go Web Component JS generation (server-side)
codegen_test.go Tests for codegen
bench_test.go Codegen benchmarks
cmd/
codegen/
main.go Build-time CLI (stdin JSON, stdout JS)
main_test.go CLI integration tests
wasm/
main.go WASM entry point (js+wasm build only)
register.go buildComponentJS helper (!js only)
register_test.go Tests for register helper
size_test.go WASM binary size gate test (!js only)
dist/ WASM build output (gitignored)
docs/ This documentation
plans/ Phase design documents (historical)
Makefile WASM build with size checking
.core/build.yaml Build system configuration
```
## Running Tests
@ -40,18 +52,32 @@ go-html/
go test ./...
# Single test by name
go test -run TestWASMBinarySize_Good ./cmd/wasm/
go test -run TestElNode_Render .
# Skip slow WASM build test
# Skip the slow WASM build test
go test -short ./...
# Tests with verbose output
# Verbose output
go test -v ./...
# Tests for a specific package
go test ./codegen/
go test ./cmd/codegen/
go test ./cmd/wasm/
```
Tests use `testify` assert and require helpers. Test names follow Go's standard `TestFunctionName` convention. Subtests use `t.Run()` with descriptive names.
The WASM size gate test (`TestWASMBinarySize_Good`) builds the WASM binary as a subprocess. It is slow and is skipped under `-short`. It is also guarded with `//go:build !js` so it cannot run within the WASM environment itself.
The WASM size gate test (`TestWASMBinarySize_Good`) builds the WASM binary as a subprocess and is therefore slow. It is skipped automatically under `-short`. It is also guarded with `//go:build !js`, so it is never compiled for `GOOS=js`, the target used for the WASM build.
### Test Dependencies
Tests use the `testify` library (`assert` and `require` packages). Integration tests and benchmarks that exercise `Text` nodes must initialise the `go-i18n` default service before rendering:
```go
svc, _ := i18n.New()
i18n.SetDefault(svc)
```
The `bench_test.go` file does this in an `init()` function. Individual integration tests do so explicitly.
## Benchmarks
@ -60,26 +86,30 @@ The WASM size gate test (`TestWASMBinarySize_Good`) builds the WASM binary as a
go test -bench . ./...
# Specific benchmark
go test -bench BenchmarkRender_FullPage ./...
go test -bench BenchmarkRender_FullPage .
# With memory allocations
# With memory allocation statistics
go test -bench . -benchmem ./...
# Fixed iteration count
# Extended benchmark duration
go test -bench . -benchtime=5s ./...
```
Benchmarks are organised by operation:
Available benchmark groups:
| Group | Variants |
|-------|---------|
| `BenchmarkRender_*` | Depth 1, 3, 5, 7 trees; full page |
| `BenchmarkLayout_*` | Content-only, HCF, HLCRF, nested, many children |
|-------|----------|
| `BenchmarkRender_*` | Depth 1, 3, 5, 7 element trees; full page with layout |
| `BenchmarkLayout_*` | Content-only, HCF, HLCRF, nested, 50-child slot |
| `BenchmarkEach_*` | 10, 100, 1000 items |
| `BenchmarkResponsive_*` | Three-variant compositor |
| `BenchmarkStripTags_*` | Short and long HTML inputs |
| `BenchmarkImprint_*` | Small and large page trees |
| `BenchmarkCompareVariants_*` | Two and three variant comparison |
| `BenchmarkGenerateClass` | Single Web Component class generation |
| `BenchmarkGenerateBundle_*` | Small (2-slot) and full (5-slot) bundles |
| `BenchmarkTagToClassName` | Kebab-to-PascalCase conversion |
| `BenchmarkGenerateRegistration` | `customElements.define()` call generation |
## WASM Build
@ -87,36 +117,33 @@ Benchmarks are organised by operation:
GOOS=js GOARCH=wasm go build -ldflags="-s -w" -o gohtml.wasm ./cmd/wasm/
```
Strip flags (`-s -w`) are required. Without them the binary is approximately 50% larger.
Strip flags (`-s -w`) are required. Without them, the binary is approximately 50% larger.
The Makefile target `make wasm` performs the build and measures the gzip size:
The Makefile `wasm` target performs the build and checks the output size:
```bash
make wasm
```
The Makefile enforces a 1 MB gzip limit (`WASM_GZ_LIMIT = 1048576`). The build fails if this limit is exceeded.
The Makefile enforces a 1 MB gzip transfer limit and a 3 MB raw size limit. Current measured output: approximately 2.90 MB raw, 842 KB gzip.
To verify the size manually:
To verify the gzip size manually:
```bash
gzip -c -9 gohtml.wasm | wc -c
```
Current measured output: 2.90 MB raw, 842 KB gzip.
## Codegen CLI
The codegen CLI reads a JSON slot map from stdin and writes a Web Component JS bundle to stdout. It is a build-time tool, not intended for runtime use.
The codegen CLI reads a JSON slot map from stdin and writes a Web Component JS bundle to stdout:
```bash
# Generate components for a three-slot layout
echo '{"H":"site-header","C":"app-content","F":"site-footer"}' \
| go run ./cmd/codegen/ \
> components.js
```
The JSON keys are HLCRF slot letters (`H`, `L`, `C`, `R`, `F`). The values are custom element tag names (must contain a hyphen). Duplicate tag values are deduplicated.
JSON keys are HLCRF slot letters (`H`, `L`, `C`, `R`, `F`). Values are custom element tag names (must contain a hyphen per the Web Components specification). Duplicate tag values are deduplicated.
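The input rules above (hyphenated tag values, duplicates collapsed) can be sketched as a small validation step. `parseSlotMap` is a hypothetical function, not the CLI's code; sorting the result is a choice made here for stable output, and the error prefix is invented for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// parseSlotMap is a hypothetical helper. It decodes a slot map, rejects
// tag names without a hyphen, and returns the unique tags in sorted order.
func parseSlotMap(data []byte) ([]string, error) {
	var slots map[string]string
	if err := json.Unmarshal(data, &slots); err != nil {
		return nil, fmt.Errorf("sketch: invalid slot map: %w", err)
	}

	seen := make(map[string]bool)
	var tags []string
	for slot, tag := range slots {
		if !strings.Contains(tag, "-") {
			return nil, fmt.Errorf("sketch: slot %s: tag %q must contain a hyphen", slot, tag)
		}
		if !seen[tag] {
			seen[tag] = true
			tags = append(tags, tag)
		}
	}
	sort.Strings(tags) // map iteration order varies; sort for stable output
	return tags, nil
}

func main() {
	tags, err := parseSlotMap([]byte(`{"H":"site-header","C":"app-content","F":"site-header"}`))
	fmt.Println(tags, err)

	_, err = parseSlotMap([]byte(`{"H":"header"}`))
	fmt.Println(err != nil)
}
```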
To test the CLI:
@ -130,7 +157,7 @@ go test ./cmd/codegen/
go vet ./...
```
The codebase passes `go vet` with no warnings.
The repository also includes a `.golangci.yml` configuration for `golangci-lint`.
## Coding Standards
@ -138,37 +165,39 @@ The codebase passes `go vet` with no warnings.
UK English throughout: colour, organisation, centre, behaviour, licence (noun), serialise. American spellings are not used.
### Types
### Type Annotations
All exported and unexported functions carry full parameter and return type annotations. The `any` alias is used in preference to `interface{}`.
### Error handling
### HTML Safety
Errors are wrapped with context using `fmt.Errorf("pkg.Function: %w", err)`. The codegen package prefixes all errors with `codegen:`.
### HTML safety
- Use `Text()` for any user-supplied or translated content. It escapes HTML.
- Use `Raw()` only for content you control or have sanitised upstream.
- Use `Text()` for any user-supplied or translated content. It escapes HTML automatically.
- Use `Raw()` only for content you control or have sanitised upstream. Its name explicitly signals "no escaping".
- Never construct HTML by string concatenation in application code.
### Error Handling
Errors are wrapped with context using `fmt.Errorf()`. The codegen package prefixes all errors with `codegen:`.
### Determinism
Output must be deterministic. Attributes are sorted before rendering. `map` iteration in `codegen.GenerateBundle()` may produce non-deterministic class order across runs — this is acceptable because Web Component registration order does not affect correctness.
Output must be deterministic. `El` node attributes are sorted alphabetically before rendering. `map` iteration order in `codegen.GenerateBundle()` may vary across runs -- this is acceptable because Web Component registration order does not affect correctness.
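The usual way to get stable output from a Go `map` is to collect and sort its keys before iterating. The sketch below shows that technique; `renderAttrs` is a hypothetical function, not the library's implementation, and it skips value escaping for brevity.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// renderAttrs is a hypothetical illustration of sorted attribute rendering.
// It does not escape values and is not the go-html implementation.
func renderAttrs(attrs map[string]string) string {
	keys := make([]string, 0, len(attrs))
	for k := range attrs {
		keys = append(keys, k)
	}
	sort.Strings(keys) // fixed order regardless of map iteration order

	var b strings.Builder
	for _, k := range keys {
		b.WriteString(" " + k + `="` + attrs[k] + `"`)
	}
	return b.String()
}

func main() {
	fmt.Println(renderAttrs(map[string]string{"id": "hero", "class": "wide"}))
}
```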
### Build tags
### Build Tags
Files excluded from WASM use `//go:build !js` as the first line, before the `package` declaration. Files compiled only under WASM use `//go:build js && wasm`. Do not use the older `// +build` syntax.
Files excluded from WASM use `//go:build !js` as the first line, before the `package` declaration. Files compiled only under WASM use `//go:build js && wasm`. The older `// +build` syntax is not used.
The `fmt` package must never be imported in files without a `!js` build tag, as it significantly inflates the WASM binary. Use string concatenation instead of `fmt.Sprintf` in layout and node code.
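As a small illustration of the string-concatenation rule, the sketch below builds an identifier with `strconv` instead of `fmt.Sprintf`. `blockID` is a hypothetical helper, and the `C-0` shape is only modelled on the `data-block` paths mentioned in these docs.

```go
package main

import "strconv"

// blockID is a hypothetical helper that joins a slot letter and an index
// without importing fmt, which keeps it safe for WASM-compiled files.
func blockID(slot string, index int) string {
	return slot + "-" + strconv.Itoa(index)
}

func main() {
	// The builtin println avoids pulling in fmt for this demonstration.
	println(blockID("C", 0))
}
```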
### Licence
All files carry the EUPL-1.2 SPDX identifier:
All new files should carry the EUPL-1.2 SPDX identifier:
```go
// SPDX-Licence-Identifier: EUPL-1.2
```
### Commit format
### Commit Format
Conventional commits with lowercase type and optional scope:
@ -179,7 +208,7 @@ test: add edge case for Unicode surrogate pairs
docs: update architecture with pipeline diagram
```
Commits include a co-author trailer:
Include a co-author trailer:
```
Co-Authored-By: Virgil <virgil@lethean.io>
@ -187,7 +216,7 @@ Co-Authored-By: Virgil <virgil@lethean.io>
## Test Patterns
### Standard unit test
### Standard Unit Test
```go
func TestElNode_Render(t *testing.T) {
@ -201,32 +230,31 @@ func TestElNode_Render(t *testing.T) {
}
```
### Table-driven subtest
### Table-Driven Subtest
```go
func TestStripTags(t *testing.T) {
cases := []struct {
func TestStripTags_Unicode(t *testing.T) {
tests := []struct {
name string
input string
want string
}{
{"empty", "", ""},
{"plain", "hello", "hello"},
{"single tag", "<p>hello</p>", "hello"},
{"nested", "<div><p>a</p><p>b</p></div>", "a b"},
{"emoji in tags", "<span>\U0001F680</span>", "\U0001F680"},
{"RTL in tags", "<div>\u0645\u0631\u062D\u0628\u0627</div>", "\u0645\u0631\u062D\u0628\u0627"},
{"CJK in tags", "<p>\u4F60\u597D\u4E16\u754C</p>", "\u4F60\u597D\u4E16\u754C"},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
got := StripTags(tc.input)
if got != tc.want {
t.Errorf("StripTags(%q) = %q, want %q", tc.input, got, tc.want)
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := StripTags(tt.input)
if got != tt.want {
t.Errorf("StripTags(%q) = %q, want %q", tt.input, got, tt.want)
}
})
}
}
```
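For readers who want to see what behaviour these cases pin down, here is one possible tag stripper that satisfies them. `stripTags` is a hypothetical sketch written for this example and should not be read as the library's `StripTags` implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// stripTags is a hypothetical sketch. It drops everything between '<' and '>'
// and inserts a single space where a tag separated two runs of text.
func stripTags(s string) string {
	var b strings.Builder
	inTag := false
	pendingSpace := false
	for _, r := range s {
		switch {
		case r == '<':
			inTag = true
		case r == '>' && inTag:
			inTag = false
			pendingSpace = true
		case inTag:
			// Characters inside a tag are discarded.
		default:
			if pendingSpace && b.Len() > 0 && !strings.HasSuffix(b.String(), " ") {
				b.WriteByte(' ')
			}
			pendingSpace = false
			b.WriteRune(r)
		}
	}
	return b.String()
}

func main() {
	fmt.Println(stripTags("<div><p>a</p><p>b</p></div>"))
}
```

Iterating by rune rather than by byte is what keeps the emoji, RTL, and CJK cases intact.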
### Integration test with i18n
### Integration Test with i18n
```go
func TestIntegration_RenderThenReverse(t *testing.T) {
@ -247,11 +275,22 @@ func TestIntegration_RenderThenReverse(t *testing.T) {
}
```
Integration tests that exercise the full pipeline (`Imprint`, `CompareVariants`) must initialise the i18n default service before rendering any `Text` nodes. The `init()` function in `bench_test.go` does this for benchmarks; individual integration tests must do so explicitly.
### Codegen Tests with Testify
```go
func TestGenerateClass_Good(t *testing.T) {
js, err := GenerateClass("photo-grid", "C")
require.NoError(t, err)
assert.Contains(t, js, "class PhotoGrid extends HTMLElement")
assert.Contains(t, js, "attachShadow")
assert.Contains(t, js, `mode: "closed"`)
}
```
## Known Limitations
- `NewLayout("XYZ")` silently produces empty output when given unrecognised slot letters. There is no warning or error. Valid slot letters are `H`, `L`, `C`, `R`, `F`.
- `Responsive.Variant()` accepts only `*Layout`, not arbitrary `Node` values. Arbitrary subtrees must be wrapped in a single-slot layout.
- `Context.service` is private. Custom i18n adapter injection requires `NewContextWithService()`. There is no way to set or swap the service after construction.
- `cmd/wasm/main.go` has no integration test for the JS exports. The `size_test.go` file tests binary size only; it does not exercise `renderToString` behaviour.
- `NewLayout("XYZ")` silently produces empty output for unrecognised slot letters. Valid letters are `H`, `L`, `C`, `R`, `F`. There is no error or warning.
- `Responsive.Variant()` accepts only `*Layout`, not arbitrary `Node` values. Arbitrary subtrees must be wrapped in a single-slot layout first.
- `Context.service` is unexported. Custom i18n service injection requires `NewContextWithService()`. There is no way to swap the service after construction.
- The WASM module has no integration test for the JavaScript exports. `size_test.go` tests binary size only; it does not exercise `renderToString` behaviour from JavaScript.
- `codegen.GenerateBundle()` iterates a `map`, so the order of class definitions in the output is non-deterministic. This does not affect correctness but may cause cosmetic diffs between runs.
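Because unrecognised slot letters fail silently, callers may want to check a variant string themselves before building a layout. The following is a minimal sketch of such a guard; `validVariant` is a hypothetical helper and does not exist in `go-html`.

```go
package main

import (
	"fmt"
	"strings"
)

// validVariant is a hypothetical guard, not part of go-html. It reports
// whether a variant string is non-empty and uses only HLCRF slot letters.
func validVariant(variant string) bool {
	if variant == "" {
		return false
	}
	for _, r := range variant {
		if !strings.ContainsRune("HLCRF", r) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(validVariant("HCF"), validVariant("XYZ"))
}
```

Treating the empty string as invalid is a choice made for this sketch; the library's own handling of an empty variant is not documented here.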

docs/index.md Normal file

@ -0,0 +1,80 @@
---
title: go-html
description: HLCRF DOM compositor with grammar pipeline integration for type-safe server-side HTML generation and optional WASM client rendering.
---
# go-html
`go-html` is a pure-Go library for building HTML documents as type-safe node trees and rendering them to string output. It provides a five-slot layout compositor (Header, Left, Content, Right, Footer -- abbreviated HLCRF), a responsive multi-variant wrapper, a server-side grammar analysis pipeline, a Web Component code generator, and an optional WASM module for client-side rendering.
**Module path:** `forge.lthn.ai/core/go-html`
**Go version:** 1.26
**Licence:** EUPL-1.2
## Quick Start
```go
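package main

import (
	"fmt"

	html "forge.lthn.ai/core/go-html"
)

// Item is a placeholder record type for this example.
type Item struct{ Name string }

func main() {
	// Placeholder data so the example compiles on its own.
	items := []Item{{Name: "First"}, {Name: "Second"}}

	page := html.NewLayout("HCF").
		H(html.El("nav", html.Text("nav.label"))).
		C(html.El("article",
			html.El("h1", html.Text("page.title")),
			html.Each(items, func(item Item) html.Node {
				return html.El("li", html.Text(item.Name))
			}),
		)).
		F(html.El("footer", html.Text("footer.copyright")))

	output := page.Render(html.NewContext())
	fmt.Println(output)
}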
```
This builds a Header-Content-Footer layout with semantic HTML elements (`<header>`, `<main>`, `<footer>`), ARIA roles, and deterministic `data-block` path identifiers. Text nodes pass through the `go-i18n` translation layer and are HTML-escaped by default.
## Package Layout
| Path | Purpose |
|------|---------|
| `node.go` | `Node` interface and all node types: `El`, `Text`, `Raw`, `If`, `Unless`, `Each`, `EachSeq`, `Switch`, `Entitled` |
| `layout.go` | HLCRF compositor with semantic HTML elements and ARIA roles |
| `responsive.go` | Multi-variant breakpoint wrapper (`data-variant` containers) |
| `context.go` | Rendering context: identity, locale, entitlements, i18n service |
| `render.go` | `Render()` convenience function |
| `path.go` | `ParseBlockID()` for decoding `data-block` path attributes |
| `pipeline.go` | `StripTags`, `Imprint`, `CompareVariants` (server-side only, `!js` build tag) |
| `codegen/codegen.go` | Web Component class generation (closed Shadow DOM) |
| `cmd/codegen/main.go` | Build-time CLI: JSON slot map on stdin, JS bundle on stdout |
| `cmd/wasm/main.go` | WASM entry point exporting `renderToString()` to JavaScript |
## Key Concepts
**Node tree** -- All renderable units implement `Node`, a single-method interface: `Render(ctx *Context) string`. The library composes nodes into trees using `El()` for elements, `Text()` for translated text, and control-flow constructors (`If`, `Unless`, `Each`, `Switch`, `Entitled`).
**HLCRF Layout** -- A five-slot compositor that maps to semantic HTML: `<header>` (H), `<aside>` (L/R), `<main>` (C), `<footer>` (F). The variant string controls which slots render: `"HLCRF"` for all five, `"HCF"` for three, `"C"` for content only. Layouts nest: placing a `Layout` inside another layout's slot produces hierarchical `data-block` paths like `L-0-C-0`.
**Responsive variants** -- `Responsive` wraps multiple `Layout` instances with named breakpoints (e.g. `"desktop"`, `"mobile"`). Each variant renders inside a `<div data-variant="name">` container for CSS or JavaScript targeting.
**Grammar pipeline** -- Server-side only. `Imprint()` renders a node tree to HTML, strips tags, tokenises the plain text via `go-i18n/reversal`, and returns a `GrammarImprint` for semantic analysis. `CompareVariants()` computes pairwise similarity scores across responsive variants.
**Web Component codegen** -- `cmd/codegen/` generates ES2022 Web Component classes with closed Shadow DOM from a JSON slot-to-tag mapping. This is a build-time tool, not used at runtime.
## Dependencies
```
forge.lthn.ai/core/go-html
forge.lthn.ai/core/go-i18n (direct, all builds)
forge.lthn.ai/core/go-inference (indirect, via go-i18n)
forge.lthn.ai/core/go-i18n/reversal (server builds only, !js)
github.com/stretchr/testify (test only)
```
Both `go-i18n` and `go-inference` must be present on the local filesystem. The `go.mod` uses `replace` directives pointing to sibling directories (`../go-i18n`, `../go-inference`).
## Further Reading
- [Architecture](architecture.md) -- Node interface, HLCRF layout internals, responsive compositor, grammar pipeline, WASM module, codegen CLI
- [Development](development.md) -- Building, testing, benchmarks, WASM builds, coding standards, contribution guide