forked from lthn/LEM

feat: WoRF — Word Radiance Field experiments

NeRF-inspired technique for learning relational dynamics of language.
Not what words mean, but how they behave together — rhythm, pacing,
punctuation patterns, style transitions.

v1: positional field over text (baseline, memorises)
v2: masked feature prediction (relational, actually works)

Trained on Wodehouse "My Man Jeeves" (public domain, Gutenberg).
All 11 style features are highly relational — the field learns that
Wodehouse's style is a tightly coupled system.

Key finding: style interpolation between narrative and dialogue
produces sensible predictions for unmeasured features, suggesting
the continuous field captures real structural patterns.

Co-Authored-By: Virgil <virgil@lethean.io>
Snider 2026-03-04 09:43:38 +00:00
parent 41d8008e69
commit f79eaabdce
6 changed files with 20480 additions and 0 deletions


@@ -0,0 +1,162 @@
# WoRF — Word Radiance Fields
> **Status**: Experimental proof-of-concept (4 Mar 2026)
> **Licence**: EUPL-1.2
## What This Is
WoRF (Word Radiance Field) is a technique inspired by NeRF (Neural Radiance
Fields) for learning the **relational dynamics** of language from text.
Not what words mean — how they behave together. The pauses, the rhythm,
the texture. The stuff current token embeddings lose entirely.
A WoRF learns a continuous field over stylistic features extracted from
text. You can query the field to understand how style dimensions relate
to each other within a body of writing. The goal: teach models not WHAT
to say, but HOW to say it.
## Origin
The idea comes from a simple observation: current LLMs start with a single
flat embedding per token and rely on transformer layers to reconstruct
all the relational richness of language. That works for content, but
loses the performance — timing, rhythm, word-choice patterns, deliberate
silences. The "gooey" stuff.
NeRF's core trick is: given sparse discrete observations, learn a
continuous function you can query at any point. A page of text is a
sparse observation of language relationships. A book is a scene.
The WoRF is the learned field.
## How It Works
### Feature Extraction
Each chunk of text (~300 words) is measured across 11 stylistic dimensions:
| Feature | What It Captures |
|---------|-----------------|
| avg_word_length | Vocabulary complexity |
| avg_sentence_length | Pacing |
| sentence_length_variance | Rhythm variation |
| dialogue_ratio | Conversation density |
| vocabulary_richness | Unique word usage |
| dash_density | Parenthetical style (asides, interjections) |
| exclamation_density | Emotional intensity |
| question_density | Interrogative patterns |
| short_sentence_ratio | Punchiness |
| aside_density | Digression patterns |
| avg_punct_per_sentence | Structural complexity |
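A cut-down sketch of how three of these are measured, mirroring the definitions in `worf-experiment.py` (the 5-word cutoff for "short" sentences is the scripts' own):

```python
import re

def sketch_features(chunk: str) -> dict:
    """Simplified versions of three of the 11 features."""
    words = chunk.split()
    sentences = [s.strip() for s in re.split(r"[.!?]+", chunk) if s.strip()]
    sent_lengths = [len(s.split()) for s in sentences]
    n_sents = len(sent_lengths) or 1
    return {
        "avg_sentence_length": sum(sent_lengths) / n_sents,
        "question_density": chunk.count("?") / (len(words) or 1),
        "short_sentence_ratio": sum(1 for n in sent_lengths if n <= 5) / n_sents,
    }

feats = sketch_features("What ho! I mean to say, what ho. Rather.")
# three sentences of 2, 6 and 1 words: mean 3.0, two of the three count as "short"
```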
### Architecture (v2 — Masked Feature Prediction)
Instead of mapping text position to features (v1, which just memorises),
v2 uses the features themselves as coordinates:
```
Input: 11 features with one masked (zeroed + flag)
Each feature gets sinusoidal positional encoding (6 frequencies)
Output: Predicted values for all 11 features
Loss: MSE between predicted and actual
```
The network learns: "given THESE style characteristics, what must the
missing one be?" Each masking angle is like a different camera view
in NeRF — it reveals a different relationship in the field.
Architecture: 6-layer MLP, 256 hidden dim, GELU activations, dropout
at midpoint. AdamW with cosine annealing. ~4000 epochs.
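The input encoding above works out to 14 values per feature (1 raw + 12 sin/cos + 1 mask flag), so 154 inputs total. A minimal sketch with the stated dimensions (`encode` and `PER_FEAT` are illustrative names, not the script's; the real model is in `tasks/worf-v2.py`):

```python
import math
import torch
import torch.nn as nn

NUM_FEATURES, NUM_FREQ, HIDDEN = 11, 6, 256

def encode(x: torch.Tensor, num_freq: int = NUM_FREQ) -> torch.Tensor:
    # sinusoidal encoding of each scalar feature value, as in NeRF
    parts = [x]
    for k in range(num_freq):
        parts += [torch.sin(2.0 ** k * math.pi * x), torch.cos(2.0 ** k * math.pi * x)]
    return torch.cat(parts, dim=-1)

# per feature: 1 raw + 2*6 sin/cos + 1 mask flag = 14; 11 * 14 = 154 inputs
PER_FEAT = 1 + 2 * NUM_FREQ + 1
mlp = nn.Sequential(
    nn.Linear(NUM_FEATURES * PER_FEAT, HIDDEN), nn.GELU(),
    *(m for _ in range(4) for m in (nn.Linear(HIDDEN, HIDDEN), nn.GELU())),
    nn.Linear(HIDDEN, NUM_FEATURES),
)

feats = torch.rand(8, NUM_FEATURES)              # a batch of normalised chunk features
mask_idx = torch.randint(0, NUM_FEATURES, (8,))  # which feature to hide, per chunk
cols = []
for f in range(NUM_FEATURES):
    enc = encode(feats[:, f:f + 1])
    hidden = (mask_idx == f).float().unsqueeze(-1)  # 1.0 where feature f is masked
    cols.append(torch.cat([enc * (1.0 - hidden), hidden], dim=-1))
x = torch.cat(cols, dim=-1)
pred = mlp(x)  # predicted values for all 11 features
```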
### What the Field Reveals
Trained on Wodehouse's "My Man Jeeves" (169 chunks, 50K words):
**Every feature is highly relational** — none are independent. The
field can predict any feature from the other 10 with near-zero error.
This means Wodehouse's style is a tightly coupled system, not random.
**Key relationships discovered:**
- `aside_density` → `avg_punct_per_sentence` (+0.32) — his parenthetical
asides ARE the signature style
- `short_sentence_ratio` → `exclamation_density` (+0.16) — punchy
sentences come with Bertie's exclamations ("What!" / "Ripping!")
- `avg_sentence_length` → `short_sentence_ratio` (-0.29) — long
sentences = exposition, short = dialogue reactions
- `sentence_length_variance` → `avg_punct_per_sentence` (+0.15) —
varied rhythm = more structural punctuation
**Style interpolation works:** Walking from narrative to dialogue,
the field correctly predicts question density rises 4x, punctuation
per sentence drops, exclamations increase. Not memorisation — the
field understands style transitions.
## What This Is For
### Near-term: Training Data Quality
WoRF features could score training corpus quality — not for correctness
but for **stylistic consistency and richness**. A chunk that doesn't
fit the field = low quality or genre mismatch.
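A minimal sketch of that scoring idea, assuming a trained field with the v2 call signature. The `MeanField` stand-in is a dummy so the sketch runs; a real run would pass the trained `WoRFv2` model and real normalised features:

```python
import torch

def chunk_fit_scores(field, feats_norm: torch.Tensor) -> torch.Tensor:
    """Mean masked-reconstruction error per chunk: high = doesn't fit the field."""
    n, d = feats_norm.shape
    errs = torch.zeros(n)
    with torch.no_grad():
        for f in range(d):
            mask_idx = torch.full((n,), f, dtype=torch.long)
            pred = field(feats_norm, mask_idx)
            errs += (pred[:, f] - feats_norm[:, f]) ** 2
    return errs / d

class MeanField:
    """Dummy stand-in for a trained WoRFv2: always predicts the column means."""
    def __init__(self, feats: torch.Tensor):
        self.mu = feats.mean(dim=0)
    def __call__(self, feats: torch.Tensor, mask_idx: torch.Tensor) -> torch.Tensor:
        return self.mu.expand_as(feats)

feats = torch.rand(169, 11)  # normalised features, one row per chunk
scores = chunk_fit_scores(MeanField(feats), feats)
flagged = (scores > scores.mean() + 2 * scores.std()).nonzero().flatten()
```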
### Medium-term: EN-GB Language Pack
Feed many public domain books through WoRF to build a style field for
"native English." The field captures how English actually flows across
authors, genres, eras. Use it as auxiliary training signal — not what
the model says, but whether it sounds like real English.
### Long-term: Style-Aware Generation
Query the WoRF during generation to guide style. "Write this with
Wodehouse's rhythm" = constrain the output to the region of style
space that Wodehouse occupies. Different from fine-tuning — it's a
continuous field you can blend and interpolate.
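The "region of style space" idea, sketched with made-up numbers: score candidate continuations by distance to a target point in normalised feature space. Real guidance would query the field rather than fixed targets, and the candidate feature values here are invented for illustration:

```python
def style_distance(feats: dict, target: dict) -> float:
    # Euclidean distance over the (normalised) style dimensions both share
    shared = feats.keys() & target.keys()
    return sum((feats[k] - target[k]) ** 2 for k in shared) ** 0.5

# hypothetical target region and candidates, numbers invented
target = {"short_sentence_ratio": 0.6, "question_density": 0.02}
candidates = {
    "long expository paragraph": {"short_sentence_ratio": 0.1, "question_density": 0.0},
    "snappy exchange": {"short_sentence_ratio": 0.55, "question_density": 0.03},
}
best = min(candidates, key=lambda c: style_distance(candidates[c], target))
```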
## Relationship to LEM
WoRF connects to existing LEM work:
- **go-i18n grammar engine** — the 19D/24D scoring dimensions could
serve as WoRF "viewing angles" (the directional component NeRF uses)
- **Poindexter** — spatial indexing via KD-Tree, already doing proximity
in embedding space. WoRF adds a style dimension to that space
- **Sandwich format** — WoRF features could become additional scoring
layers in the training curriculum
- **CL-BPL** (cymatic-linguistic back-propagation) — same wave
interference maths NeRF uses for reconstruction
## Files
```
tasks/worf.txt # Original Grok chat transcript (concept)
tasks/worf-experiment.md # Experiment notes
tasks/worf-experiment.py # v1: position → features (memorised, useful baseline)
tasks/worf-v2.py # v2: masked feature prediction (relational field)
tasks/worf-field-jeeves.json # v1 field data
tasks/worf-v2-relations.json # v2 influence matrix
tasks/pg-wood.txt # Source: My Man Jeeves (Gutenberg, public domain)
```
## Next Steps
1. Add more public domain books (Wilde, Austen, Twain, Poe) and see
if the field distinguishes authors or finds universal English patterns
2. Increase feature dimensions — add n-gram patterns, word frequency
distributions, clause structure
3. Connect to go-i18n scoring as "viewing angle" dimensions
4. Test as training data quality filter on existing LEM datasets
5. Explore whether the influence matrix itself is useful as a compact
style representation (11x11 = 121 numbers to describe an author)
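The idea in point 5 can be sketched as a Frobenius distance between influence matrices (random stand-in matrices here, not measured data):

```python
import numpy as np

def fingerprint_distance(m_a: np.ndarray, m_b: np.ndarray) -> float:
    """Frobenius distance between two 11x11 influence matrices."""
    return float(np.linalg.norm(m_a - m_b))

rng = np.random.default_rng(0)
wodehouse = rng.normal(0.0, 0.1, (11, 11))             # stand-in, not real data
similar = wodehouse + rng.normal(0.0, 0.02, (11, 11))  # a nearby style
unrelated = rng.normal(0.0, 0.1, (11, 11))             # an independent one

d_close = fingerprint_distance(wodehouse, similar)
d_far = fingerprint_distance(wodehouse, unrelated)
```

If the matrix really is a compact author signature, nearby styles should land close under this metric and unrelated ones far away.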
## Easter Egg
WoRF is named after Commander Worf, but the real reference is Data's
"little life forms" song to Spot. The idea: a model that can understand
why Eckhart Tolle is funny without being prompted, because it learned
the pause is the punchline.
---
*EUPL-1.2 — Lethean Network*

experiments/worf/pg-wood.txt Normal file

File diff suppressed because it is too large


@@ -0,0 +1,390 @@
#!/usr/bin/env python3
"""
WoRF Experiment — Word Radiance Field
======================================
Feed Wodehouse's "My Man Jeeves" into a NeRF-like MLP and see what
the continuous field learns about writing style.
NeRF: (x, y, z, θ, φ) → (r, g, b, σ)
WoRF: (position_in_text, chunk_context) → (style_features)
First pass: 1D position → style feature vector. No viewing angle yet.
Just see if a continuous field over text position learns anything.
"""
import re
import math
import json
import torch
import torch.nn as nn
import numpy as np
from pathlib import Path
from collections import Counter
# ---------------------------------------------------------------------------
# 1. Text Splitting — each "page" is one observation (like one photo for NeRF)
# ---------------------------------------------------------------------------
def load_and_clean(path: str) -> str:
"""Strip Gutenberg header/footer, return clean text."""
text = Path(path).read_text(encoding="utf-8")
# Strip PG header
start = text.find("LEAVE IT TO JEEVES")
if start == -1:
start = text.find("*** START OF")
start = text.find("\n", start) + 1
# Strip PG footer
end = text.find("*** END OF THE PROJECT GUTENBERG")
if end == -1:
end = len(text)
return text[start:end].strip()
def split_into_chunks(text: str, chunk_size: int = 500) -> list[str]:
"""Split text into roughly equal word-count chunks (pages)."""
words = text.split()
chunks = []
for i in range(0, len(words), chunk_size):
chunk = " ".join(words[i:i + chunk_size])
if len(chunk.split()) > 50: # skip tiny trailing chunks
chunks.append(chunk)
return chunks
# ---------------------------------------------------------------------------
# 2. Feature Extraction — what we measure about each "page"
# ---------------------------------------------------------------------------
def extract_features(chunk: str) -> dict:
"""Extract stylistic features from a chunk of text.
These are the 'RGB + density' equivalent — what the field predicts.
"""
words = chunk.split()
sentences = re.split(r'[.!?]+', chunk)
sentences = [s.strip() for s in sentences if s.strip()]
word_lengths = [len(w.strip(".,;:!?\"'()—-")) for w in words]
word_lengths = [l for l in word_lengths if l > 0]
# Dialogue detection
dialogue_chars = sum(1 for c in chunk if c == '"')
total_chars = len(chunk) or 1
# Punctuation patterns (Wodehouse loves dashes and exclamations)
dashes = chunk.count("—") + chunk.count("--")
exclamations = chunk.count("!")
questions = chunk.count("?")
commas = chunk.count(",")
# Vocabulary richness (unique words / total words)
unique_words = len(set(w.lower().strip(".,;:!?\"'()—-") for w in words))
total_words = len(words) or 1
# Sentence length variation (std dev) — captures rhythm
sent_lengths = [len(s.split()) for s in sentences]
sent_mean = np.mean(sent_lengths) if sent_lengths else 0
sent_std = np.std(sent_lengths) if sent_lengths else 0
# Short sentence ratio (punchy lines like "Injudicious, sir.")
short_sentences = sum(1 for l in sent_lengths if l <= 5)
short_ratio = short_sentences / (len(sent_lengths) or 1)
# Aside/parenthetical density (commas, dashes per word)
aside_density = (commas + dashes) / total_words
return {
"avg_word_length": np.mean(word_lengths) if word_lengths else 0,
"avg_sentence_length": sent_mean,
"sentence_length_variance": sent_std,
"dialogue_ratio": dialogue_chars / total_chars,
"vocabulary_richness": unique_words / total_words,
"dash_density": dashes / total_words,
"exclamation_density": exclamations / total_words,
"question_density": questions / total_words,
"short_sentence_ratio": short_ratio,
"aside_density": aside_density,
"avg_punct_per_sentence": (commas + dashes + exclamations + questions) / (len(sent_lengths) or 1),
}
FEATURE_NAMES = list(extract_features("dummy text here for keys").keys())
NUM_FEATURES = len(FEATURE_NAMES)
# ---------------------------------------------------------------------------
# 3. Positional Encoding — NeRF's trick for capturing high-frequency detail
# ---------------------------------------------------------------------------
def positional_encoding(x: torch.Tensor, num_frequencies: int = 10) -> torch.Tensor:
"""NeRF-style sinusoidal positional encoding.
Maps a scalar position into a higher-dimensional space so the MLP
can learn sharp transitions (same reason NeRF needs it for edges).
"""
encodings = [x]
for freq in range(num_frequencies):
encodings.append(torch.sin(2.0 ** freq * math.pi * x))
encodings.append(torch.cos(2.0 ** freq * math.pi * x))
return torch.cat(encodings, dim=-1)
# ---------------------------------------------------------------------------
# 4. The WoRF Network — tiny MLP, same architecture as vanilla NeRF
# ---------------------------------------------------------------------------
class WoRF(nn.Module):
"""Word Radiance Field — learns a continuous style field over text position."""
def __init__(self, input_dim: int, hidden_dim: int = 128, num_layers: int = 4,
output_dim: int = NUM_FEATURES):
super().__init__()
layers = []
layers.append(nn.Linear(input_dim, hidden_dim))
layers.append(nn.ReLU())
for i in range(num_layers - 2):
layers.append(nn.Linear(hidden_dim, hidden_dim))
layers.append(nn.ReLU())
# NB: no skip connection; unlike NeRF, this shallow MLP trains fine without one
layers.append(nn.Linear(hidden_dim, output_dim))
self.network = nn.Sequential(*layers)
self.input_dim = input_dim
self.hidden_dim = hidden_dim
def forward(self, x: torch.Tensor) -> torch.Tensor:
return self.network(x)
# ---------------------------------------------------------------------------
# 5. Training
# ---------------------------------------------------------------------------
def prepare_data(chunks: list[str]) -> tuple[torch.Tensor, torch.Tensor, dict]:
"""Convert chunks to training data: positions → features."""
n = len(chunks)
positions = []
features = []
for i, chunk in enumerate(chunks):
pos = i / max(n - 1, 1) # normalise to [0, 1]; guard the single-chunk case
feat = extract_features(chunk)
positions.append(pos)
features.append([feat[k] for k in FEATURE_NAMES])
positions = torch.tensor(positions, dtype=torch.float32).unsqueeze(-1)
features = torch.tensor(features, dtype=torch.float32)
# Normalise features to [0, 1] range for training stability
feat_min = features.min(dim=0).values
feat_max = features.max(dim=0).values
feat_range = feat_max - feat_min
feat_range[feat_range == 0] = 1.0 # avoid division by zero
features_norm = (features - feat_min) / feat_range
norm_stats = {"min": feat_min, "max": feat_max, "range": feat_range}
return positions, features_norm, norm_stats
def train_worf(positions: torch.Tensor, features: torch.Tensor,
num_frequencies: int = 10, epochs: int = 2000, lr: float = 1e-3):
"""Train the WoRF field."""
# Encode positions
encoded = positional_encoding(positions, num_frequencies)
input_dim = encoded.shape[-1]
model = WoRF(input_dim=input_dim, output_dim=features.shape[-1])
optimiser = torch.optim.Adam(model.parameters(), lr=lr)
loss_fn = nn.MSELoss()
print(f"\nTraining WoRF: {len(positions)} chunks, {input_dim}D input, {features.shape[-1]} features")
print(f"Positional encoding frequencies: {num_frequencies}")
print("-" * 60)
for epoch in range(epochs):
pred = model(encoded)
loss = loss_fn(pred, features)
optimiser.zero_grad()
loss.backward()
optimiser.step()
if epoch % 200 == 0 or epoch == epochs - 1:
print(f" Epoch {epoch:4d}/{epochs} Loss: {loss.item():.6f}")
return model, num_frequencies
# ---------------------------------------------------------------------------
# 6. Query the Field — the interesting bit
# ---------------------------------------------------------------------------
def query_field(model: WoRF, num_frequencies: int, num_points: int = 500) -> tuple[np.ndarray, np.ndarray]:
"""Query the learned field at many points, including between training samples."""
positions = torch.linspace(0, 1, num_points).unsqueeze(-1)
encoded = positional_encoding(positions, num_frequencies)
with torch.no_grad():
predictions = model(encoded).numpy()
return positions.squeeze().numpy(), predictions
def analyse_field(positions: np.ndarray, predictions: np.ndarray,
norm_stats: dict, chunks: list[str]):
"""Analyse what the field learned."""
print("\n" + "=" * 60)
print("FIELD ANALYSIS")
print("=" * 60)
# Denormalise for interpretability
feat_min = norm_stats["min"].numpy()
feat_range = norm_stats["range"].numpy()
predictions_real = predictions * feat_range + feat_min
# Find peaks and valleys for each feature
print("\nFeature dynamics across the book:")
print("-" * 60)
for i, name in enumerate(FEATURE_NAMES):
values = predictions_real[:, i]
peak_pos = positions[np.argmax(values)]
valley_pos = positions[np.argmin(values)]
mean_val = np.mean(values)
std_val = np.std(values)
dynamic_range = np.max(values) - np.min(values)
print(f" {name:30s} mean={mean_val:.4f} std={std_val:.4f} "
f"range={dynamic_range:.4f} peak@{peak_pos:.2f} valley@{valley_pos:.2f}")
# Find story boundaries by looking for sharp transitions
print("\n\nSharp transitions (potential story/scene boundaries):")
print("-" * 60)
# Use total gradient magnitude across all features
gradients = np.diff(predictions, axis=0)
gradient_magnitude = np.sqrt(np.sum(gradients ** 2, axis=1))
# Find top transition points
top_transitions = np.argsort(gradient_magnitude)[-8:] # top 8 (roughly one per story)
top_transitions = np.sort(top_transitions)
for idx in top_transitions:
pos = positions[idx]
# Estimate which chunk this corresponds to
chunk_idx = int(pos * (len(chunks) - 1))
chunk_preview = chunks[min(chunk_idx, len(chunks) - 1)][:80]
print(f" Position {pos:.3f} (magnitude {gradient_magnitude[idx]:.4f})")
print(f" Text: \"{chunk_preview}...\"")
print()
# Compare dialogue-heavy vs narrative-heavy regions
print("\nDialogue vs Narrative rhythm:")
print("-" * 60)
dialogue_idx = FEATURE_NAMES.index("dialogue_ratio")
sent_var_idx = FEATURE_NAMES.index("sentence_length_variance")
short_idx = FEATURE_NAMES.index("short_sentence_ratio")
# Split into quartiles
n = len(positions)
for q, label in [(0, "Opening"), (1, "Early-mid"), (2, "Late-mid"), (3, "Closing")]:
start = q * n // 4
end = (q + 1) * n // 4
avg_dialogue = np.mean(predictions_real[start:end, dialogue_idx])
avg_variance = np.mean(predictions_real[start:end, sent_var_idx])
avg_short = np.mean(predictions_real[start:end, short_idx])
print(f" {label:12s} dialogue={avg_dialogue:.4f} "
f"sent_variance={avg_variance:.4f} short_ratio={avg_short:.4f}")
# Interpolation test — what does the field predict BETWEEN chunks?
print("\n\nInterpolation test (querying between training points):")
print("-" * 60)
print("The field predicts style features at positions where no text exists.")
print("If interpolation is smooth and sensible, the field learned structure.")
print("If it's noisy/random, it just memorised individual chunks.")
# Check smoothness: average absolute second derivative
second_deriv = np.diff(predictions, n=2, axis=0)
smoothness = np.mean(np.abs(second_deriv))
print(f"\n Smoothness score (lower = smoother): {smoothness:.6f}")
if smoothness < 0.01:
print(" → Very smooth field — learned continuous style patterns")
elif smoothness < 0.05:
print(" → Moderately smooth — some structure learned")
else:
print(" → Rough field — mostly memorised chunks")
return predictions_real
# ---------------------------------------------------------------------------
# 7. Save results for later
# ---------------------------------------------------------------------------
def save_results(positions, predictions_real, output_path):
"""Save the field data as JSON for potential visualisation later."""
results = {
"positions": positions.tolist(),
"features": {
name: predictions_real[:, i].tolist()
for i, name in enumerate(FEATURE_NAMES)
},
"feature_names": FEATURE_NAMES,
"description": "WoRF continuous field over Wodehouse's 'My Man Jeeves'",
}
Path(output_path).write_text(json.dumps(results, indent=2))
print(f"\nField data saved to {output_path}")
# ---------------------------------------------------------------------------
# Main
# ---------------------------------------------------------------------------
def main():
book_path = Path(__file__).parent / "pg-wood.txt"
print("WoRF — Word Radiance Field Experiment")
print("=" * 60)
print(f"Source: {book_path.name}")
# Load and split
text = load_and_clean(str(book_path))
print(f"Clean text: {len(text):,} characters, {len(text.split()):,} words")
chunks = split_into_chunks(text, chunk_size=300)
print(f"Chunks: {len(chunks)} (≈300 words each)")
# Prepare training data
positions, features, norm_stats = prepare_data(chunks)
print(f"Feature dimensions: {NUM_FEATURES}")
print(f"Features: {', '.join(FEATURE_NAMES)}")
# Train
model, num_freq = train_worf(positions, features, epochs=3000)
# Query the continuous field
query_positions, predictions = query_field(model, num_freq, num_points=1000)
# Analyse
predictions_real = analyse_field(query_positions, predictions, norm_stats, chunks)
# Save
save_results(query_positions, predictions_real,
str(book_path.parent / "worf-field-jeeves.json"))
print("\n" + "=" * 60)
print("Done. The field exists. Poke it and see what it tells you.")
print("=" * 60)
if __name__ == "__main__":
main()


@@ -0,0 +1,474 @@
#!/usr/bin/env python3
"""
WoRF v2 — Word Radiance Field (Feature-Space)
===============================================
v1 used position-in-book as coordinates — just memorised chunks.
v2 uses the style features themselves as the coordinate system.
The field learns relationships BETWEEN style dimensions:
"When dialogue is high and sentences are short, what happens to
vocabulary richness and aside density?"
That's the relational wording data — not what words mean,
but how they behave together. The stuff a language pack needs.
Multi-book ready: each book is more "photos" of the same style field.
"""
import re
import math
import json
import torch
import torch.nn as nn
import numpy as np
from pathlib import Path
from collections import Counter
from dataclasses import dataclass
WORF_DIR = Path(__file__).parent
# ---------------------------------------------------------------------------
# 1. Feature Extraction (same as v1, proven to work)
# ---------------------------------------------------------------------------
FEATURE_NAMES = [
"avg_word_length",
"avg_sentence_length",
"sentence_length_variance",
"dialogue_ratio",
"vocabulary_richness",
"dash_density",
"exclamation_density",
"question_density",
"short_sentence_ratio",
"aside_density",
"avg_punct_per_sentence",
]
NUM_FEATURES = len(FEATURE_NAMES)
def extract_features(chunk: str) -> list[float]:
"""Extract stylistic features from a chunk of text."""
words = chunk.split()
sentences = re.split(r'[.!?]+', chunk)
sentences = [s.strip() for s in sentences if s.strip()]
word_lengths = [len(w.strip(".,;:!?\"'()—-")) for w in words]
word_lengths = [wl for wl in word_lengths if wl > 0]
dialogue_chars = sum(1 for c in chunk if c == '"')
total_chars = len(chunk) or 1
dashes = chunk.count("—") + chunk.count("--")
exclamations = chunk.count("!")
questions = chunk.count("?")
commas = chunk.count(",")
unique_words = len(set(w.lower().strip(".,;:!?\"'()—-") for w in words))
total_words = len(words) or 1
sent_lengths = [len(s.split()) for s in sentences]
sent_mean = float(np.mean(sent_lengths)) if sent_lengths else 0.0
sent_std = float(np.std(sent_lengths)) if sent_lengths else 0.0
short_sentences = sum(1 for sl in sent_lengths if sl <= 5)
short_ratio = short_sentences / (len(sent_lengths) or 1)
aside_density = (commas + dashes) / total_words
return [
float(np.mean(word_lengths)) if word_lengths else 0.0,
sent_mean,
sent_std,
dialogue_chars / total_chars,
unique_words / total_words,
dashes / total_words,
exclamations / total_words,
questions / total_words,
short_ratio,
aside_density,
(commas + dashes + exclamations + questions) / (len(sent_lengths) or 1),
]
# ---------------------------------------------------------------------------
# 2. Text Loading (multi-book ready)
# ---------------------------------------------------------------------------
@dataclass
class BookChunk:
text: str
features: list[float]
book: str
chunk_idx: int
position: float # 0-1 position within book
def load_gutenberg(path: str, title: str) -> list[BookChunk]:
"""Load a Gutenberg text, split into chunks, extract features."""
text = Path(path).read_text(encoding="utf-8")
# Strip PG header/footer
for marker in ["*** START OF THE PROJECT GUTENBERG EBOOK",
"*** START OF THIS PROJECT GUTENBERG EBOOK"]:
idx = text.find(marker)
if idx != -1:
text = text[text.find("\n", idx) + 1:]
break
end = text.find("*** END OF THE PROJECT GUTENBERG")
if end != -1:
text = text[:end]
text = text.strip()
words = text.split()
chunk_size = 300
chunks = []
for i in range(0, len(words), chunk_size):
chunk_text = " ".join(words[i:i + chunk_size])
if len(chunk_text.split()) > 50:
chunks.append(chunk_text)
results = []
for i, chunk_text in enumerate(chunks):
results.append(BookChunk(
text=chunk_text,
features=extract_features(chunk_text),
book=title,
chunk_idx=i,
position=i / max(len(chunks) - 1, 1),
))
print(f" {title}: {len(results)} chunks from {len(words):,} words")
return results
# ---------------------------------------------------------------------------
# 3. WoRF v2 — Masked Feature Prediction
# ---------------------------------------------------------------------------
#
# Instead of position → features, we do:
# features_with_one_masked → predict_all_features
#
# This learns the RELATIONSHIPS between style dimensions.
# Like a denoising autoencoder where each mask reveals a different
# relationship. Like NeRF views — each masking angle shows a different
# aspect of the same underlying field.
def positional_encoding(x: torch.Tensor, num_frequencies: int = 6) -> torch.Tensor:
"""Sinusoidal encoding for continuous feature values."""
encodings = [x]
for freq in range(num_frequencies):
encodings.append(torch.sin(2.0 ** freq * math.pi * x))
encodings.append(torch.cos(2.0 ** freq * math.pi * x))
return torch.cat(encodings, dim=-1)
class WoRFv2(nn.Module):
"""Word Radiance Field v2 — learns inter-feature relationships.
Input: N features (one zeroed out) + mask indicator per feature
Output: predicted values for all features
The network learns: given these style characteristics,
what must the missing one be? That's the relational field.
"""
def __init__(self, num_features: int, num_frequencies: int = 6,
hidden_dim: int = 256, num_layers: int = 6):
super().__init__()
self.num_features = num_features
self.num_frequencies = num_frequencies
per_feature_dim = 1 + 2 * num_frequencies # encoded value
input_dim = num_features * (per_feature_dim + 1) # +1 for mask flag
layers = []
layers.append(nn.Linear(input_dim, hidden_dim))
layers.append(nn.GELU())
for i in range(num_layers - 2):
layers.append(nn.Linear(hidden_dim, hidden_dim))
layers.append(nn.GELU())
if i == num_layers // 2 - 2:
layers.append(nn.Dropout(0.05))
layers.append(nn.Linear(hidden_dim, num_features))
self.network = nn.Sequential(*layers)
def encode_input(self, features: torch.Tensor, mask_idx: torch.Tensor) -> torch.Tensor:
"""Encode features with positional encoding + mask flags."""
encoded_parts = []
for f in range(self.num_features):
feat_val = features[:, f:f+1]
feat_encoded = positional_encoding(feat_val, self.num_frequencies)
is_masked = (mask_idx == f).float().unsqueeze(-1)
feat_encoded = feat_encoded * (1.0 - is_masked)
feat_with_mask = torch.cat([feat_encoded, is_masked], dim=-1)
encoded_parts.append(feat_with_mask)
return torch.cat(encoded_parts, dim=-1)
def forward(self, features: torch.Tensor, mask_idx: torch.Tensor) -> torch.Tensor:
encoded = self.encode_input(features, mask_idx)
return self.network(encoded)
# ---------------------------------------------------------------------------
# 4. Training
# ---------------------------------------------------------------------------
def train_worf_v2(chunks: list[BookChunk], epochs: int = 3000, lr: float = 5e-4):
"""Train WoRF v2 with random feature masking."""
features = torch.tensor([c.features for c in chunks], dtype=torch.float32)
feat_min = features.min(dim=0).values
feat_max = features.max(dim=0).values
feat_range = feat_max - feat_min
feat_range[feat_range == 0] = 1.0
features_norm = (features - feat_min) / feat_range
norm_stats = {"min": feat_min, "max": feat_max, "range": feat_range}
model = WoRFv2(num_features=NUM_FEATURES)
optimiser = torch.optim.AdamW(model.parameters(), lr=lr, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimiser, T_max=epochs)
loss_fn = nn.MSELoss()
n_chunks = len(chunks)
print(f"\nTraining WoRF v2: {n_chunks} chunks, {NUM_FEATURES} features")
print(f"Architecture: masked feature prediction (like masked autoencoder)")
print("-" * 60)
best_loss = float("inf")
for epoch in range(epochs):
mask_idx = torch.randint(0, NUM_FEATURES, (n_chunks,))
pred = model(features_norm, mask_idx)
loss = loss_fn(pred, features_norm)
optimiser.zero_grad()
loss.backward()
optimiser.step()
scheduler.step()
if loss.item() < best_loss:
best_loss = loss.item()
if epoch % 300 == 0 or epoch == epochs - 1:
print(f" Epoch {epoch:4d}/{epochs} Loss: {loss.item():.6f} "
f"Best: {best_loss:.6f} LR: {scheduler.get_last_lr()[0]:.6f}")
return model, features_norm, norm_stats
# ---------------------------------------------------------------------------
# 5. Analysis
# ---------------------------------------------------------------------------
def probe_relationships(model: WoRFv2, features_norm: torch.Tensor, norm_stats: dict):
"""Probe what the field learned about feature relationships."""
print("\n" + "=" * 60)
print("RELATIONAL FIELD ANALYSIS")
print("=" * 60)
model.eval()
# --- Test 1: Feature predictability ---
print("\nFeature predictability (lower error = stronger relationship to others):")
print("-" * 60)
feature_errors = {}
with torch.no_grad():
for f in range(NUM_FEATURES):
mask_idx = torch.full((len(features_norm),), f, dtype=torch.long)
pred = model(features_norm, mask_idx)
error = torch.mean((pred[:, f] - features_norm[:, f]) ** 2).item()
feature_errors[FEATURE_NAMES[f]] = error
sorted_features = sorted(feature_errors.items(), key=lambda x: x[1])
for name, error in sorted_features:
bar_len = int((1 - min(error * 20, 1)) * 40)
bar = "#" * bar_len
predictability = "highly relational" if error < 0.01 else \
"moderately relational" if error < 0.05 else "independent"
print(f" {name:30s} error={error:.5f} [{bar:40s}] {predictability}")
# --- Test 2: Feature influence matrix ---
print("\n\nFeature influence matrix:")
print("(When feature X increases, what happens to feature Y?)")
print("-" * 60)
influence_matrix = np.zeros((NUM_FEATURES, NUM_FEATURES))
with torch.no_grad():
baseline = features_norm.mean(dim=0, keepdim=True)
for source_f in range(NUM_FEATURES):
high = baseline.clone()
low = baseline.clone()
high[0, source_f] = 0.9
low[0, source_f] = 0.1
for target_f in range(NUM_FEATURES):
if target_f == source_f:
continue
mask = torch.tensor([target_f])
pred_high = model(high, mask)[0, target_f].item()
pred_low = model(low, mask)[0, target_f].item()
influence_matrix[source_f, target_f] = pred_high - pred_low
# Print matrix
short_names = [n[:8] for n in FEATURE_NAMES]
print(f"\n {'':30s}", end="")
for sn in short_names:
print(f" {sn:>8s}", end="")
print()
for i, name in enumerate(FEATURE_NAMES):
print(f" {name:30s}", end="")
for j in range(NUM_FEATURES):
val = influence_matrix[i, j]
if i == j:
print(f" ---", end="")
elif abs(val) > 0.15:
print(f" {val:+.2f}*", end="")
else:
print(f" {val:+.3f}", end="")
print()
# --- Test 3: Style interpolation ---
print("\n\nStyle interpolation (walking through the field):")
print("-" * 60)
print("Interpolating between 'narrative exposition' and 'snappy dialogue':\n")
with torch.no_grad():
narrative = baseline.clone()
narrative[0, FEATURE_NAMES.index("dialogue_ratio")] = 0.05
narrative[0, FEATURE_NAMES.index("avg_sentence_length")] = 0.8
narrative[0, FEATURE_NAMES.index("short_sentence_ratio")] = 0.1
narrative[0, FEATURE_NAMES.index("vocabulary_richness")] = 0.8
dialogue = baseline.clone()
dialogue[0, FEATURE_NAMES.index("dialogue_ratio")] = 0.9
dialogue[0, FEATURE_NAMES.index("avg_sentence_length")] = 0.2
dialogue[0, FEATURE_NAMES.index("short_sentence_ratio")] = 0.8
dialogue[0, FEATURE_NAMES.index("vocabulary_richness")] = 0.4
predict_features = [
FEATURE_NAMES.index("exclamation_density"),
FEATURE_NAMES.index("question_density"),
FEATURE_NAMES.index("dash_density"),
FEATURE_NAMES.index("aside_density"),
FEATURE_NAMES.index("avg_punct_per_sentence"),
]
print(f" {'blend':>5s}", end="")
for name in ["excl_dens", "quest_dens", "dash_dens", "aside_dens", "punct/sent"]:
print(f" {name:>10s}", end="")
print()
print(f" {'':>5s}{'':->55s}")
for alpha in np.linspace(0, 1, 11):
blended = narrative * (1 - alpha) + dialogue * alpha
predictions = []
for pf in predict_features:
mask = torch.tensor([pf])
pred = model(blended, mask)[0, pf].item()
pred_real = pred * norm_stats["range"][pf].item() + norm_stats["min"][pf].item()
predictions.append(pred_real)
label = "narr" if alpha < 0.3 else "dial" if alpha > 0.7 else "mix"
print(f" {alpha:4.1f}{label:1s}", end="")
for p in predictions:
print(f" {p:10.4f}", end="")
print()
# --- Test 4: Reconstruction accuracy ---
print("\n\nReconstruction accuracy per feature:")
print("-" * 60)
with torch.no_grad():
total_error = 0
total_count = 0
for f in range(NUM_FEATURES):
mask = torch.full((len(features_norm),), f, dtype=torch.long)
pred = model(features_norm, mask)
errors = (pred[:, f] - features_norm[:, f]) ** 2
rmse_real = math.sqrt(errors.mean().item()) * norm_stats["range"][f].item()
total_error += errors.sum().item()
total_count += len(errors)
print(f" {FEATURE_NAMES[f]:30s} RMSE (real units): {rmse_real:.4f}")
avg_error = total_error / total_count
print(f"\n Overall MSE (normalised units): {avg_error:.6f}")
print(f" Overall RMSE (normalised units): {math.sqrt(avg_error):.4f}")
return influence_matrix
# ---------------------------------------------------------------------------
# 6. Save
# ---------------------------------------------------------------------------
def save_results(influence_matrix: np.ndarray, output_path: str):
"""Save the influence matrix and metadata."""
results = {
"feature_names": FEATURE_NAMES,
"influence_matrix": influence_matrix.tolist(),
"description": "WoRF v2: inter-feature influence matrix from masked prediction",
"interpretation": "influence_matrix[i][j] = when feature i goes high, "
"how much does the predicted value of feature j change",
}
Path(output_path).write_text(json.dumps(results, indent=2))
print(f"\nResults saved to {output_path}")
# ---------------------------------------------------------------------------
# Main
# ---------------------------------------------------------------------------
def main():
print("WoRF v2 — Word Radiance Field (Relational)")
print("=" * 60)
all_chunks = []
book_path = WORF_DIR / "pg-wood.txt"
if book_path.exists():
all_chunks.extend(load_gutenberg(str(book_path), "My Man Jeeves"))
# Add more books here:
# all_chunks.extend(load_gutenberg("pg-wilde.txt", "Importance of Being Earnest"))
# all_chunks.extend(load_gutenberg("pg-austen.txt", "Pride and Prejudice"))
if not all_chunks:
print(f"No books found! Expected a Gutenberg text at {book_path}")
return
books = set(c.book for c in all_chunks)
print(f"\nTotal: {len(all_chunks)} chunks from {len(books)} book(s)")
model, features_norm, norm_stats = train_worf_v2(all_chunks, epochs=4000)
influence_matrix = probe_relationships(model, features_norm, norm_stats)
save_results(influence_matrix, str(WORF_DIR / "worf-v2-relations.json"))
print("\n" + "=" * 60)
print("The relational field exists.")
print("This is what Wodehouse's English 'feels like' in feature space.")
print("Add more books to build toward an EN-GB WoRF language pack.")
print("=" * 60)
if __name__ == "__main__":
main()
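Once the script has produced `worf-v2-relations.json`, the matrix can be mined directly for its strongest couplings. A minimal sketch — the helper name and the three-feature toy matrix below are illustrative, not part of the script; with real output you would pass `results["feature_names"]` and `results["influence_matrix"]` from the loaded JSON:

```python
def strongest_influences(feature_names, influence_matrix, top_k=3):
    """Rank off-diagonal entries by absolute influence, strongest first."""
    pairs = [
        (source, target, influence_matrix[i][j])
        for i, source in enumerate(feature_names)
        for j, target in enumerate(feature_names)
        if i != j
    ]
    pairs.sort(key=lambda p: abs(p[2]), reverse=True)
    return pairs[:top_k]

# Toy 3-feature matrix for illustration. With the real output you would load:
#   results = json.loads(Path("worf-v2-relations.json").read_text())
#   strongest_influences(results["feature_names"], results["influence_matrix"])
names = ["dash_density", "aside_density", "avg_punct_per_sentence"]
matrix = [
    [0.00, 0.15, 0.13],
    [0.04, 0.00, 0.32],
    [0.05, 0.10, 0.00],
]
for source, target, value in strongest_influences(names, matrix):
    print(f"{source:>24s} -> {target:<24s} {value:+.2f}")
```

Because influence is directional, `[i][j]` and `[j][i]` are ranked as separate pairs — a feature can drive another much harder than it is driven back.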

@@ -0,0 +1,162 @@
{
"feature_names": [
"avg_word_length",
"avg_sentence_length",
"sentence_length_variance",
"dialogue_ratio",
"vocabulary_richness",
"dash_density",
"exclamation_density",
"question_density",
"short_sentence_ratio",
"aside_density",
"avg_punct_per_sentence"
],
"influence_matrix": [
[
0.0,
-0.03247326612472534,
-0.0239107608795166,
-0.00048324093222618103,
0.1107892394065857,
0.015222892165184021,
-0.024353697896003723,
0.02327282726764679,
0.055540263652801514,
0.04952073097229004,
-0.018031805753707886
],
[
-0.11262395977973938,
0.0,
0.1966363489627838,
0.0003904178738594055,
-0.02297872304916382,
-0.068694107234478,
-0.12937799841165543,
-0.19205902516841888,
-0.29318100214004517,
-0.09364050626754761,
0.21115505695343018
],
[
0.005609989166259766,
0.13626961410045624,
0.0,
-0.0007154941558837891,
-0.02271491289138794,
0.005668185651302338,
-0.0020959973335266113,
-0.01791289448738098,
0.04299241304397583,
0.03149789571762085,
0.153947114944458
],
[
-0.01625087857246399,
0.012996375560760498,
0.004404813051223755,
0.0,
-0.004828751087188721,
-0.010406054556369781,
0.012377187609672546,
-0.007560417056083679,
0.017317771911621094,
-0.006858497858047485,
0.013844549655914307
],
[
0.05449041724205017,
-0.002728700637817383,
0.03543153405189514,
-0.0007495768368244171,
0.0,
0.02357766404747963,
-0.06922292709350586,
-0.01401202380657196,
0.03409099578857422,
-0.022808074951171875,
-0.06983467936515808
],
[
0.05502724647521973,
-0.028156444430351257,
0.016653388738632202,
-0.0004658550024032593,
0.008968591690063477,
0.0,
0.07332807779312134,
0.004690051078796387,
0.004198431968688965,
0.1471288800239563,
0.1343848705291748
],
[
-0.008408337831497192,
-0.03403817117214203,
-0.03511646389961243,
0.0002146884799003601,
0.01336967945098877,
0.012008734047412872,
0.0,
-0.038716867566108704,
0.01683211326599121,
0.015300273895263672,
0.038202375173568726
],
[
-0.04866918921470642,
-0.09030131995677948,
-0.08065217733383179,
0.0006130747497081757,
-0.04372537136077881,
0.035463668406009674,
0.020850971341133118,
0.0,
0.06807422637939453,
0.04871469736099243,
0.015091657638549805
],
[
0.07264012098312378,
-0.17126457393169403,
0.007805615663528442,
0.0005212798714637756,
-0.07545053958892822,
-0.011027880012989044,
0.16361884027719498,
0.1303078681230545,
0.0,
0.08242395520210266,
-0.042179644107818604
],
[
0.05252787470817566,
-0.06419773399829865,
0.006353020668029785,
-0.0005619712173938751,
-0.03329026699066162,
0.04053857922554016,
0.05099382996559143,
0.0370599627494812,
0.05590474605560303,
0.0,
0.22894394397735596
],
[
-0.011781513690948486,
0.0985381007194519,
0.09538811445236206,
-0.00027120113372802734,
-0.0469667911529541,
0.04663299024105072,
0.04154162108898163,
0.0520768016576767,
-0.12925076484680176,
0.32439711689949036,
0.0
]
],
"description": "WoRF v2: inter-feature influence matrix from masked prediction",
"interpretation": "influence_matrix[i][j] = when feature i goes high, how much does the predicted value of feature j change"
}
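To read the matrix: row `i` is a source feature forced high, column `j` is the predicted shift in a target. The largest entry, +0.324, says heavy aside usage pulls per-sentence punctuation up with it. A quick way to find each target's strongest driver, sketched on a three-feature excerpt of the values above (rounded; the NumPy usage is illustrative, not from the script):

```python
import numpy as np

# 3-feature excerpt of the influence matrix above (values rounded);
# rows = source feature driven high, columns = predicted change in target.
names = ["avg_sentence_length", "short_sentence_ratio", "avg_punct_per_sentence"]
M = np.array([
    [ 0.000, -0.293,  0.211],
    [-0.171,  0.000, -0.042],
    [ 0.099, -0.129,  0.000],
])

for j, target in enumerate(names):
    col = np.abs(M[:, j])
    col[j] = -1.0  # ignore the diagonal
    i = int(col.argmax())
    print(f"strongest driver of {target}: {names[i]} ({M[i, j]:+.3f})")
```

The -0.293 coupling from `avg_sentence_length` to `short_sentence_ratio` is the field recovering an obvious structural constraint — longer average sentences suppress the predicted share of short ones — which is a useful sanity check on the learned relations.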