feat: WoRF — Word Radiance Field experiments
NeRF-inspired technique for learning the relational dynamics of language.
Not what words mean, but how they behave together — rhythm, pacing,
punctuation patterns, style transitions.

- v1: positional field over text (baseline; memorises)
- v2: masked feature prediction (relational; actually works)

Trained on Wodehouse's "My Man Jeeves" (public domain, Gutenberg).
All 11 style features are highly relational — the field learns that
Wodehouse's style is a tightly coupled system.

Key finding: style interpolation between narrative and dialogue produces
sensible predictions for unmeasured features, suggesting the continuous
field captures real structural patterns.

Co-Authored-By: Virgil <virgil@lethean.io>
parent 41d8008e69
commit f79eaabdce
6 changed files with 20480 additions and 0 deletions

162  docs/plans/2026-03-04-worf-design.md  (new file)

@@ -0,0 +1,162 @@
# WoRF — Word Radiance Fields

> **Status**: Experimental proof-of-concept (4 Mar 2026)
> **Licence**: EUPL-1.2

## What This Is

WoRF (Word Radiance Field) is a technique inspired by NeRF (Neural Radiance
Fields) for learning the **relational dynamics** of language from text.

Not what words mean — how they behave together. The pauses, the rhythm,
the texture. The stuff current token embeddings lose entirely.

A WoRF learns a continuous field over stylistic features extracted from
text. You can query the field to understand how style dimensions relate
to each other within a body of writing. The goal: teach models not WHAT
to say, but HOW to say it.

## Origin

The idea comes from a simple observation: current LLMs start with a single
flat embedding per token and rely on transformer layers to reconstruct
all the relational richness of language. That works for content, but
loses the performance — timing, rhythm, word-choice patterns, deliberate
silences. The "gooey" stuff.

NeRF's core trick: given sparse discrete observations, learn a
continuous function you can query at any point. A page of text is a
sparse observation of language relationships. A book is a scene.
The WoRF is the learned field.
## How It Works

### Feature Extraction

Each chunk of text (~300 words) is measured across 11 stylistic dimensions:

| Feature | What It Captures |
|---------|-----------------|
| avg_word_length | Vocabulary complexity |
| avg_sentence_length | Pacing |
| sentence_length_variance | Rhythm variation |
| dialogue_ratio | Conversation density |
| vocabulary_richness | Unique word usage |
| dash_density | Parenthetical style (asides, interjections) |
| exclamation_density | Emotional intensity |
| question_density | Interrogative patterns |
| short_sentence_ratio | Punchiness |
| aside_density | Digression patterns |
| avg_punct_per_sentence | Structural complexity |
### Architecture (v2 — Masked Feature Prediction)

Instead of mapping text position to features (v1, which just memorises),
v2 uses the features themselves as coordinates:

```
Input:  11 features with one masked (zeroed + flag)
        Each feature gets sinusoidal positional encoding (6 frequencies)
Output: predicted values for all 11 features
Loss:   MSE between predicted and actual
```

The network learns: "given THESE style characteristics, what must the
missing one be?" Each masking angle is like a different camera view
in NeRF — it reveals a different relationship in the field.

Architecture: 6-layer MLP, 256 hidden dim, GELU activations, dropout
at the midpoint. AdamW with cosine annealing. ~4,000 epochs.
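The mask-and-encode step can be sketched in NumPy (a simplified stand-in mirroring the torch `encode_input` in `v2_relational.py`; shapes and names here are illustrative):

```python
import numpy as np

def sinusoidal_encode(x: np.ndarray, num_frequencies: int = 6) -> np.ndarray:
    """Encode each scalar as [x, sin(2^k*pi*x), cos(2^k*pi*x), ...]."""
    parts = [x]
    for k in range(num_frequencies):
        parts.append(np.sin(2.0 ** k * np.pi * x))
        parts.append(np.cos(2.0 ** k * np.pi * x))
    return np.concatenate(parts, axis=-1)

def mask_and_encode(features: np.ndarray, mask_idx: np.ndarray,
                    num_frequencies: int = 6) -> np.ndarray:
    """Zero out one feature's encoding per row and append a mask flag.

    features: (batch, 11) normalised style features
    mask_idx: (batch,) index of the feature hidden from the network
    """
    parts = []
    for f in range(features.shape[1]):
        enc = sinusoidal_encode(features[:, f:f + 1], num_frequencies)
        flag = (mask_idx == f).astype(np.float32)[:, None]
        enc = enc * (1.0 - flag)  # hide the masked feature's value
        parts.append(np.concatenate([enc, flag], axis=-1))
    return np.concatenate(parts, axis=-1)  # (batch, 11 * (1 + 2*6 + 1))

x = np.random.rand(4, 11).astype(np.float32)
encoded = mask_and_encode(x, np.array([0, 3, 7, 10]))
```

The MLP then regresses all 11 features from this encoding, so the loss on the masked slot forces it to learn inter-feature relationships.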
### What the Field Reveals

Trained on Wodehouse's "My Man Jeeves" (169 chunks, 50K words):

**Every feature is highly relational** — none are independent. The
field can predict any feature from the other 10 with near-zero error.
This means Wodehouse's style is a tightly coupled system, not random.

**Key relationships discovered:**

- `aside_density` ↔ `avg_punct_per_sentence` (+0.32) — his parenthetical
  asides ARE the signature style
- `short_sentence_ratio` → `exclamation_density` (+0.16) — punchy
  sentences come with Bertie's exclamations ("What!" / "Ripping!")
- `avg_sentence_length` → `short_sentence_ratio` (-0.29) — long
  sentences = exposition, short = dialogue reactions
- `sentence_length_variance` → `avg_punct_per_sentence` (+0.15) —
  varied rhythm = more structural punctuation

**Style interpolation works:** walking from narrative to dialogue,
the field correctly predicts that question density rises 4x, punctuation
per sentence drops, and exclamations increase. That is not memorisation —
the field understands style transitions.
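The interpolation walk itself is just a straight line in normalised feature space, with the trained field queried at each step. A sketch with illustrative endpoint values (not the measured Jeeves data):

```python
import numpy as np

# Illustrative endpoint style vectors, ordered as in the feature table above.
narrative = np.array([4.6, 18.0, 9.0, 0.01, 0.55, 0.01, 0.005, 0.01, 0.10, 0.22, 3.1])
dialogue  = np.array([3.9,  8.0, 6.5, 0.08, 0.48, 0.02, 0.030, 0.04, 0.35, 0.15, 1.8])

def style_walk(a: np.ndarray, b: np.ndarray, steps: int = 5) -> list:
    """Linear walk through style space; each point would be fed to the
    trained field as a query coordinate (here we return the interpolants)."""
    return [a + t * (b - a) for t in np.linspace(0.0, 1.0, steps)]

path = style_walk(narrative, dialogue)
```

The interesting test is whether the field's predictions for the *unmasked* features along this path stay sensible, which is what the experiment reports.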
## What This Is For

### Near-term: Training Data Quality

WoRF features could score training-corpus quality — not for correctness
but for **stylistic consistency and richness**. A chunk that doesn't
fit the field is either low quality or a genre mismatch.
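A sketch of that scoring idea, assuming a trained masked-prediction model; `predict_masked` here is a hypothetical stand-in for calling the v2 network:

```python
import numpy as np

def field_fit_score(features: np.ndarray, predict_masked) -> float:
    """Mean reconstruction error when the field must recover each feature
    from the other ten. High error = the chunk doesn't fit the learned style."""
    errors = []
    for f in range(len(features)):
        masked = features.copy()
        masked[f] = 0.0  # hide this feature from the (hypothetical) model
        errors.append(abs(predict_masked(masked, f) - features[f]))
    return float(np.mean(errors))

# Stand-in "field" that always predicts 0.5; a real run would call the trained model.
score_typical = field_fit_score(np.full(11, 0.5), lambda x, f: 0.5)
score_outlier = field_fit_score(np.full(11, 0.9), lambda x, f: 0.5)
```

Chunks above some error threshold would be flagged for review or down-weighted.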
### Medium-term: EN-GB Language Pack

Feed many public-domain books through WoRF to build a style field for
"native English." The field captures how English actually flows across
authors, genres, and eras. Use it as an auxiliary training signal — not
what the model says, but whether it sounds like real English.

### Long-term: Style-Aware Generation

Query the WoRF during generation to guide style. "Write this with
Wodehouse's rhythm" = constrain the output to the region of style
space that Wodehouse occupies. Different from fine-tuning — it's a
continuous field you can blend and interpolate.
## Relationship to LEM

WoRF connects to existing LEM work:

- **go-i18n grammar engine** — the 19D/24D scoring dimensions could
  serve as WoRF "viewing angles" (the directional component NeRF uses)
- **Poindexter** — spatial indexing via KD-Tree, already doing proximity
  in embedding space. WoRF adds a style dimension to that space
- **Sandwich format** — WoRF features could become additional scoring
  layers in the training curriculum
- **CL-BPL** (cymatic-linguistic back-propagation) — the same wave
  interference maths NeRF uses for reconstruction
## Files

```
tasks/worf.txt               # Original Grok chat transcript (concept)
tasks/worf-experiment.md     # Experiment notes
tasks/worf-experiment.py     # v1: position → features (memorised; useful baseline)
tasks/worf-v2.py             # v2: masked feature prediction (relational field)
tasks/worf-field-jeeves.json # v1 field data
tasks/worf-v2-relations.json # v2 influence matrix
tasks/pg-wood.txt            # Source: My Man Jeeves (Gutenberg, public domain)
```
## Next Steps

1. Add more public-domain books (Wilde, Austen, Twain, Poe) and see
   whether the field distinguishes authors or finds universal English patterns
2. Increase feature dimensions — add n-gram patterns, word-frequency
   distributions, clause structure
3. Connect to go-i18n scoring as "viewing angle" dimensions
4. Test as a training-data quality filter on existing LEM datasets
5. Explore whether the influence matrix itself is useful as a compact
   style representation (11×11 = 121 numbers to describe an author)
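For item 5, a plain feature-correlation matrix over chunks is the cheapest stand-in for the field's influence matrix — same 11×11 shape, computable without any training:

```python
import numpy as np

def style_fingerprint(chunk_features) -> np.ndarray:
    """11×11 feature-correlation matrix as a compact author signature
    (121 numbers). Plain correlation is a cheap proxy for the influence
    matrix the trained field produces."""
    X = np.asarray(chunk_features)        # (num_chunks, 11)
    return np.corrcoef(X, rowvar=False)   # (11, 11)

rng = np.random.default_rng(0)
fp = style_fingerprint(rng.random((169, 11)))  # 169 chunks, as in the Jeeves run
```

Whether the learned influence matrix carries more authorial signal than this raw correlation is exactly what the experiment would need to show.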
## Easter Egg

WoRF is named after Commander Worf, but the real reference is Data's
"little life forms" song to Spot. The idea: a model that can understand
why Eckhart Tolle is funny without being prompted, because it learned
that the pause is the punchline.

---

*EUPL-1.2 — Lethean Network*
7250  experiments/worf/pg-wood.txt  (new file; diff suppressed — too large)
390  experiments/worf/v1_positional.py  (new file)

@@ -0,0 +1,390 @@
#!/usr/bin/env python3
"""
WoRF Experiment — Word Radiance Field
======================================

Feed Wodehouse's "My Man Jeeves" into a NeRF-like MLP and see what
the continuous field learns about writing style.

NeRF: (x, y, z, θ, φ) → (r, g, b, σ)
WoRF: (position_in_text, chunk_context) → (style_features)

First pass: 1D position → style feature vector. No viewing angle yet.
Just see if a continuous field over text position learns anything.
"""

import re
import math
import json
import torch
import torch.nn as nn
import numpy as np
from pathlib import Path

# ---------------------------------------------------------------------------
# 1. Text Splitting — each "page" is one observation (like one photo for NeRF)
# ---------------------------------------------------------------------------

def load_and_clean(path: str) -> str:
    """Strip the Gutenberg header/footer and return clean text."""
    text = Path(path).read_text(encoding="utf-8")
    # Strip PG header
    start = text.find("LEAVE IT TO JEEVES")
    if start == -1:
        start = text.find("*** START OF")
        start = text.find("\n", start) + 1
    # Strip PG footer
    end = text.find("*** END OF THE PROJECT GUTENBERG")
    if end == -1:
        end = len(text)
    return text[start:end].strip()


def split_into_chunks(text: str, chunk_size: int = 500) -> list[str]:
    """Split text into roughly equal word-count chunks (pages)."""
    words = text.split()
    chunks = []
    for i in range(0, len(words), chunk_size):
        chunk = " ".join(words[i:i + chunk_size])
        if len(chunk.split()) > 50:  # skip tiny trailing chunks
            chunks.append(chunk)
    return chunks

# ---------------------------------------------------------------------------
# 2. Feature Extraction — what we measure about each "page"
# ---------------------------------------------------------------------------

def extract_features(chunk: str) -> dict:
    """Extract stylistic features from a chunk of text.

    These are the 'RGB + density' equivalent — what the field predicts.
    """
    words = chunk.split()
    sentences = re.split(r'[.!?]+', chunk)
    sentences = [s.strip() for s in sentences if s.strip()]

    word_lengths = [len(w.strip(".,;:!?\"'()—-")) for w in words]
    word_lengths = [l for l in word_lengths if l > 0]

    # Dialogue detection
    dialogue_chars = sum(1 for c in chunk if c == '"')
    total_chars = len(chunk) or 1

    # Punctuation patterns (Wodehouse loves dashes and exclamations)
    dashes = chunk.count("—") + chunk.count("--")
    exclamations = chunk.count("!")
    questions = chunk.count("?")
    commas = chunk.count(",")

    # Vocabulary richness (unique words / total words)
    unique_words = len(set(w.lower().strip(".,;:!?\"'()—-") for w in words))
    total_words = len(words) or 1

    # Sentence length variation (std dev) — captures rhythm
    sent_lengths = [len(s.split()) for s in sentences]
    sent_mean = np.mean(sent_lengths) if sent_lengths else 0
    sent_std = np.std(sent_lengths) if sent_lengths else 0

    # Short sentence ratio (punchy lines like "Injudicious, sir.")
    short_sentences = sum(1 for l in sent_lengths if l <= 5)
    short_ratio = short_sentences / (len(sent_lengths) or 1)

    # Aside/parenthetical density (commas, dashes per word)
    aside_density = (commas + dashes) / total_words

    return {
        "avg_word_length": np.mean(word_lengths) if word_lengths else 0,
        "avg_sentence_length": sent_mean,
        "sentence_length_variance": sent_std,
        "dialogue_ratio": dialogue_chars / total_chars,
        "vocabulary_richness": unique_words / total_words,
        "dash_density": dashes / total_words,
        "exclamation_density": exclamations / total_words,
        "question_density": questions / total_words,
        "short_sentence_ratio": short_ratio,
        "aside_density": aside_density,
        "avg_punct_per_sentence": (commas + dashes + exclamations + questions) / (len(sent_lengths) or 1),
    }


FEATURE_NAMES = list(extract_features("dummy text here for keys").keys())
NUM_FEATURES = len(FEATURE_NAMES)

# ---------------------------------------------------------------------------
# 3. Positional Encoding — NeRF's trick for capturing high-frequency detail
# ---------------------------------------------------------------------------

def positional_encoding(x: torch.Tensor, num_frequencies: int = 10) -> torch.Tensor:
    """NeRF-style sinusoidal positional encoding.

    Maps a scalar position into a higher-dimensional space so the MLP
    can learn sharp transitions (same reason NeRF needs it for edges).
    """
    encodings = [x]
    for freq in range(num_frequencies):
        encodings.append(torch.sin(2.0 ** freq * math.pi * x))
        encodings.append(torch.cos(2.0 ** freq * math.pi * x))
    return torch.cat(encodings, dim=-1)

# ---------------------------------------------------------------------------
# 4. The WoRF Network — tiny MLP, same architecture as vanilla NeRF
# ---------------------------------------------------------------------------

class WoRF(nn.Module):
    """Word Radiance Field — learns a continuous style field over text position."""

    def __init__(self, input_dim: int, hidden_dim: int = 128, num_layers: int = 4,
                 output_dim: int = NUM_FEATURES):
        super().__init__()

        layers = []
        layers.append(nn.Linear(input_dim, hidden_dim))
        layers.append(nn.ReLU())

        for i in range(num_layers - 2):
            layers.append(nn.Linear(hidden_dim, hidden_dim))
            layers.append(nn.ReLU())
            # Record the midpoint for a NeRF-style skip connection; note the
            # forward pass below is currently a plain Sequential, so this
            # index is bookkeeping only until the skip is wired in.
            if i == (num_layers - 2) // 2 - 1:
                self.skip_layer_idx = len(layers)

        layers.append(nn.Linear(hidden_dim, output_dim))

        self.network = nn.Sequential(*layers)
        self.input_dim = input_dim
        self.hidden_dim = hidden_dim

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.network(x)

# ---------------------------------------------------------------------------
# 5. Training
# ---------------------------------------------------------------------------

def prepare_data(chunks: list[str]) -> tuple[torch.Tensor, torch.Tensor, dict]:
    """Convert chunks to training data: positions → features."""
    n = len(chunks)
    positions = []
    features = []

    for i, chunk in enumerate(chunks):
        pos = i / max(n - 1, 1)  # normalise to [0, 1]; guard the one-chunk case
        feat = extract_features(chunk)
        positions.append(pos)
        features.append([feat[k] for k in FEATURE_NAMES])

    positions = torch.tensor(positions, dtype=torch.float32).unsqueeze(-1)
    features = torch.tensor(features, dtype=torch.float32)

    # Normalise features to [0, 1] range for training stability
    feat_min = features.min(dim=0).values
    feat_max = features.max(dim=0).values
    feat_range = feat_max - feat_min
    feat_range[feat_range == 0] = 1.0  # avoid division by zero
    features_norm = (features - feat_min) / feat_range

    norm_stats = {"min": feat_min, "max": feat_max, "range": feat_range}

    return positions, features_norm, norm_stats


def train_worf(positions: torch.Tensor, features: torch.Tensor,
               num_frequencies: int = 10, epochs: int = 2000, lr: float = 1e-3):
    """Train the WoRF field."""
    # Encode positions
    encoded = positional_encoding(positions, num_frequencies)
    input_dim = encoded.shape[-1]

    model = WoRF(input_dim=input_dim, output_dim=features.shape[-1])
    optimiser = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()

    print(f"\nTraining WoRF: {len(positions)} chunks, {input_dim}D input, {features.shape[-1]} features")
    print(f"Positional encoding frequencies: {num_frequencies}")
    print("-" * 60)

    for epoch in range(epochs):
        pred = model(encoded)
        loss = loss_fn(pred, features)

        optimiser.zero_grad()
        loss.backward()
        optimiser.step()

        if epoch % 200 == 0 or epoch == epochs - 1:
            print(f"  Epoch {epoch:4d}/{epochs}  Loss: {loss.item():.6f}")

    return model, num_frequencies

# ---------------------------------------------------------------------------
# 6. Query the Field — the interesting bit
# ---------------------------------------------------------------------------

def query_field(model: WoRF, num_frequencies: int,
                num_points: int = 500) -> tuple[np.ndarray, np.ndarray]:
    """Query the learned field at many points, including between training samples."""
    positions = torch.linspace(0, 1, num_points).unsqueeze(-1)
    encoded = positional_encoding(positions, num_frequencies)

    with torch.no_grad():
        predictions = model(encoded).numpy()

    return positions.squeeze().numpy(), predictions


def analyse_field(positions: np.ndarray, predictions: np.ndarray,
                  norm_stats: dict, chunks: list[str]):
    """Analyse what the field learned."""
    print("\n" + "=" * 60)
    print("FIELD ANALYSIS")
    print("=" * 60)

    # Denormalise for interpretability
    feat_min = norm_stats["min"].numpy()
    feat_range = norm_stats["range"].numpy()
    predictions_real = predictions * feat_range + feat_min

    # Find peaks and valleys for each feature
    print("\nFeature dynamics across the book:")
    print("-" * 60)

    for i, name in enumerate(FEATURE_NAMES):
        values = predictions_real[:, i]
        peak_pos = positions[np.argmax(values)]
        valley_pos = positions[np.argmin(values)]
        mean_val = np.mean(values)
        std_val = np.std(values)
        dynamic_range = np.max(values) - np.min(values)

        print(f"  {name:30s} mean={mean_val:.4f} std={std_val:.4f} "
              f"range={dynamic_range:.4f} peak@{peak_pos:.2f} valley@{valley_pos:.2f}")

    # Find story boundaries by looking for sharp transitions
    print("\n\nSharp transitions (potential story/scene boundaries):")
    print("-" * 60)

    # Use total gradient magnitude across all features
    gradients = np.diff(predictions, axis=0)
    gradient_magnitude = np.sqrt(np.sum(gradients ** 2, axis=1))

    # Find top transition points
    top_transitions = np.argsort(gradient_magnitude)[-8:]  # top 8 (roughly one per story)
    top_transitions = np.sort(top_transitions)

    for idx in top_transitions:
        pos = positions[idx]
        # Estimate which chunk this corresponds to
        chunk_idx = int(pos * (len(chunks) - 1))
        chunk_preview = chunks[min(chunk_idx, len(chunks) - 1)][:80]
        print(f"  Position {pos:.3f} (magnitude {gradient_magnitude[idx]:.4f})")
        print(f"    Text: \"{chunk_preview}...\"")
        print()

    # Compare dialogue-heavy vs narrative-heavy regions
    print("\nDialogue vs Narrative rhythm:")
    print("-" * 60)

    dialogue_idx = FEATURE_NAMES.index("dialogue_ratio")
    sent_var_idx = FEATURE_NAMES.index("sentence_length_variance")
    short_idx = FEATURE_NAMES.index("short_sentence_ratio")

    # Split into quartiles
    n = len(positions)
    for q, label in [(0, "Opening"), (1, "Early-mid"), (2, "Late-mid"), (3, "Closing")]:
        start = q * n // 4
        end = (q + 1) * n // 4
        avg_dialogue = np.mean(predictions_real[start:end, dialogue_idx])
        avg_variance = np.mean(predictions_real[start:end, sent_var_idx])
        avg_short = np.mean(predictions_real[start:end, short_idx])
        print(f"  {label:12s} dialogue={avg_dialogue:.4f} "
              f"sent_variance={avg_variance:.4f} short_ratio={avg_short:.4f}")

    # Interpolation test — what does the field predict BETWEEN chunks?
    print("\n\nInterpolation test (querying between training points):")
    print("-" * 60)
    print("The field predicts style features at positions where no text exists.")
    print("If interpolation is smooth and sensible, the field learned structure.")
    print("If it's noisy/random, it just memorised individual chunks.")

    # Check smoothness: average absolute second derivative
    second_deriv = np.diff(predictions, n=2, axis=0)
    smoothness = np.mean(np.abs(second_deriv))
    print(f"\n  Smoothness score (lower = smoother): {smoothness:.6f}")

    if smoothness < 0.01:
        print("  → Very smooth field — learned continuous style patterns")
    elif smoothness < 0.05:
        print("  → Moderately smooth — some structure learned")
    else:
        print("  → Rough field — mostly memorised chunks")

    return predictions_real

# ---------------------------------------------------------------------------
# 7. Save results for later
# ---------------------------------------------------------------------------

def save_results(positions, predictions_real, output_path):
    """Save the field data as JSON for potential visualisation later."""
    results = {
        "positions": positions.tolist(),
        "features": {
            name: predictions_real[:, i].tolist()
            for i, name in enumerate(FEATURE_NAMES)
        },
        "feature_names": FEATURE_NAMES,
        "description": "WoRF continuous field over Wodehouse's 'My Man Jeeves'",
    }
    Path(output_path).write_text(json.dumps(results, indent=2))
    print(f"\nField data saved to {output_path}")

# ---------------------------------------------------------------------------
# Main
# ---------------------------------------------------------------------------

def main():
    book_path = Path(__file__).parent / "pg-wood.txt"

    print("WoRF — Word Radiance Field Experiment")
    print("=" * 60)
    print(f"Source: {book_path.name}")

    # Load and split
    text = load_and_clean(str(book_path))
    print(f"Clean text: {len(text):,} characters, {len(text.split()):,} words")

    chunks = split_into_chunks(text, chunk_size=300)
    print(f"Chunks: {len(chunks)} (≈300 words each)")

    # Prepare training data
    positions, features, norm_stats = prepare_data(chunks)
    print(f"Feature dimensions: {NUM_FEATURES}")
    print(f"Features: {', '.join(FEATURE_NAMES)}")

    # Train
    model, num_freq = train_worf(positions, features, epochs=3000)

    # Query the continuous field
    query_positions, predictions = query_field(model, num_freq, num_points=1000)

    # Analyse
    predictions_real = analyse_field(query_positions, predictions, norm_stats, chunks)

    # Save
    save_results(query_positions, predictions_real,
                 str(book_path.parent / "worf-field-jeeves.json"))

    print("\n" + "=" * 60)
    print("Done. The field exists. Poke it and see what it tells you.")
    print("=" * 60)


if __name__ == "__main__":
    main()
474  experiments/worf/v2_relational.py  (new file)

@@ -0,0 +1,474 @@
#!/usr/bin/env python3
"""
WoRF v2 — Word Radiance Field (Feature-Space)
===============================================

v1 used position-in-book as coordinates → it just memorised chunks.
v2 uses the style features themselves as the coordinate system.

The field learns relationships BETWEEN style dimensions:
"When dialogue is high and sentences are short, what happens to
vocabulary richness and aside density?"

That's the relational wording data — not what words mean,
but how they behave together. The stuff a language pack needs.

Multi-book ready: each book is more "photos" of the same style field.
"""

import re
import math
import json
import torch
import torch.nn as nn
import numpy as np
from pathlib import Path
from collections import Counter
from dataclasses import dataclass

WORF_DIR = Path(__file__).parent

# ---------------------------------------------------------------------------
# 1. Feature Extraction (same as v1, proven to work)
# ---------------------------------------------------------------------------

FEATURE_NAMES = [
    "avg_word_length",
    "avg_sentence_length",
    "sentence_length_variance",
    "dialogue_ratio",
    "vocabulary_richness",
    "dash_density",
    "exclamation_density",
    "question_density",
    "short_sentence_ratio",
    "aside_density",
    "avg_punct_per_sentence",
]
NUM_FEATURES = len(FEATURE_NAMES)


def extract_features(chunk: str) -> list[float]:
    """Extract stylistic features from a chunk of text."""
    words = chunk.split()
    sentences = re.split(r'[.!?]+', chunk)
    sentences = [s.strip() for s in sentences if s.strip()]

    word_lengths = [len(w.strip(".,;:!?\"'()—-")) for w in words]
    word_lengths = [wl for wl in word_lengths if wl > 0]

    dialogue_chars = sum(1 for c in chunk if c == '"')
    total_chars = len(chunk) or 1

    dashes = chunk.count("—") + chunk.count("--")
    exclamations = chunk.count("!")
    questions = chunk.count("?")
    commas = chunk.count(",")

    unique_words = len(set(w.lower().strip(".,;:!?\"'()—-") for w in words))
    total_words = len(words) or 1

    sent_lengths = [len(s.split()) for s in sentences]
    sent_mean = float(np.mean(sent_lengths)) if sent_lengths else 0.0
    sent_std = float(np.std(sent_lengths)) if sent_lengths else 0.0

    short_sentences = sum(1 for sl in sent_lengths if sl <= 5)
    short_ratio = short_sentences / (len(sent_lengths) or 1)

    aside_density = (commas + dashes) / total_words

    return [
        float(np.mean(word_lengths)) if word_lengths else 0.0,
        sent_mean,
        sent_std,
        dialogue_chars / total_chars,
        unique_words / total_words,
        dashes / total_words,
        exclamations / total_words,
        questions / total_words,
        short_ratio,
        aside_density,
        (commas + dashes + exclamations + questions) / (len(sent_lengths) or 1),
    ]

# ---------------------------------------------------------------------------
# 2. Text Loading (multi-book ready)
# ---------------------------------------------------------------------------

@dataclass
class BookChunk:
    text: str
    features: list[float]
    book: str
    chunk_idx: int
    position: float  # 0-1 position within book


def load_gutenberg(path: str, title: str) -> list[BookChunk]:
    """Load a Gutenberg text, split into chunks, extract features."""
    text = Path(path).read_text(encoding="utf-8")

    # Strip PG header/footer
    for marker in ["*** START OF THE PROJECT GUTENBERG EBOOK",
                   "*** START OF THIS PROJECT GUTENBERG EBOOK"]:
        idx = text.find(marker)
        if idx != -1:
            text = text[text.find("\n", idx) + 1:]
            break

    end = text.find("*** END OF THE PROJECT GUTENBERG")
    if end != -1:
        text = text[:end]

    text = text.strip()
    words = text.split()
    chunk_size = 300
    chunks = []

    for i in range(0, len(words), chunk_size):
        chunk_text = " ".join(words[i:i + chunk_size])
        if len(chunk_text.split()) > 50:
            chunks.append(chunk_text)

    results = []
    for i, chunk_text in enumerate(chunks):
        results.append(BookChunk(
            text=chunk_text,
            features=extract_features(chunk_text),
            book=title,
            chunk_idx=i,
            position=i / max(len(chunks) - 1, 1),
        ))

    print(f"  {title}: {len(results)} chunks from {len(words):,} words")
    return results

# ---------------------------------------------------------------------------
|
||||
# 3. WoRF v2 — Masked Feature Prediction
|
||||
# ---------------------------------------------------------------------------
|
||||
#
|
||||
# Instead of position → features, we do:
|
||||
# features_with_one_masked → predict_all_features
|
||||
#
|
||||
# This learns the RELATIONSHIPS between style dimensions.
|
||||
# Like a denoising autoencoder where each mask reveals a different
|
||||
# relationship. Like NeRF views — each masking angle shows a different
|
||||
# aspect of the same underlying field.
|
||||
|
||||
def positional_encoding(x: torch.Tensor, num_frequencies: int = 6) -> torch.Tensor:
|
||||
"""Sinusoidal encoding for continuous feature values."""
|
||||
encodings = [x]
|
||||
for freq in range(num_frequencies):
|
||||
encodings.append(torch.sin(2.0 ** freq * math.pi * x))
|
||||
encodings.append(torch.cos(2.0 ** freq * math.pi * x))
|
||||
return torch.cat(encodings, dim=-1)
|
||||
|
||||
|
||||
class WoRFv2(nn.Module):
    """Word Radiance Field v2 — learns inter-feature relationships.

    Input: N features (one zeroed out) + mask indicator per feature
    Output: predicted values for all features

    The network learns: given these style characteristics,
    what must the missing one be? That's the relational field.
    """

    def __init__(self, num_features: int, num_frequencies: int = 6,
                 hidden_dim: int = 256, num_layers: int = 6):
        super().__init__()

        self.num_features = num_features
        self.num_frequencies = num_frequencies
        per_feature_dim = 1 + 2 * num_frequencies  # encoded value
        input_dim = num_features * (per_feature_dim + 1)  # +1 for mask flag

        layers = []
        layers.append(nn.Linear(input_dim, hidden_dim))
        layers.append(nn.GELU())

        for i in range(num_layers - 2):
            layers.append(nn.Linear(hidden_dim, hidden_dim))
            layers.append(nn.GELU())
            if i == num_layers // 2 - 2:
                layers.append(nn.Dropout(0.05))

        layers.append(nn.Linear(hidden_dim, num_features))

        self.network = nn.Sequential(*layers)

    def encode_input(self, features: torch.Tensor, mask_idx: torch.Tensor) -> torch.Tensor:
        """Encode features with positional encoding + mask flags."""
        encoded_parts = []

        for f in range(self.num_features):
            feat_val = features[:, f:f+1]
            feat_encoded = positional_encoding(feat_val, self.num_frequencies)

            # Zero out the encoding of the masked feature so the network
            # cannot simply copy it through.
            is_masked = (mask_idx == f).float().unsqueeze(-1)
            feat_encoded = feat_encoded * (1.0 - is_masked)

            feat_with_mask = torch.cat([feat_encoded, is_masked], dim=-1)
            encoded_parts.append(feat_with_mask)

        return torch.cat(encoded_parts, dim=-1)

    def forward(self, features: torch.Tensor, mask_idx: torch.Tensor) -> torch.Tensor:
        encoded = self.encode_input(features, mask_idx)
        return self.network(encoded)

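# Worked example (my addition, mirrors the arithmetic in WoRFv2.__init__):
# each feature contributes 1 + 2 * num_frequencies encoded values plus one
# mask flag, so with 11 features and 6 frequency bands the MLP input is
# 11 * (13 + 1) = 154 values.
def _worf_input_dim(num_features: int, num_frequencies: int = 6) -> int:
    per_feature_dim = 1 + 2 * num_frequencies
    return num_features * (per_feature_dim + 1)
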
# ---------------------------------------------------------------------------
# 4. Training
# ---------------------------------------------------------------------------

def train_worf_v2(chunks: list[BookChunk], epochs: int = 3000, lr: float = 5e-4):
    """Train WoRF v2 with random feature masking."""
    features = torch.tensor([c.features for c in chunks], dtype=torch.float32)

    # Min-max normalise each feature to [0, 1]; guard constant features
    # against division by zero.
    feat_min = features.min(dim=0).values
    feat_max = features.max(dim=0).values
    feat_range = feat_max - feat_min
    feat_range[feat_range == 0] = 1.0
    features_norm = (features - feat_min) / feat_range
    norm_stats = {"min": feat_min, "max": feat_max, "range": feat_range}

    model = WoRFv2(num_features=NUM_FEATURES)
    optimiser = torch.optim.AdamW(model.parameters(), lr=lr, weight_decay=1e-4)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimiser, T_max=epochs)
    loss_fn = nn.MSELoss()

    n_chunks = len(chunks)
    print(f"\nTraining WoRF v2: {n_chunks} chunks, {NUM_FEATURES} features")
    print("Architecture: masked feature prediction (like a masked autoencoder)")
    print("-" * 60)

    best_loss = float("inf")

    for epoch in range(epochs):
        # Mask one randomly chosen feature per chunk each epoch.
        mask_idx = torch.randint(0, NUM_FEATURES, (n_chunks,))

        pred = model(features_norm, mask_idx)
        loss = loss_fn(pred, features_norm)

        optimiser.zero_grad()
        loss.backward()
        optimiser.step()
        scheduler.step()

        if loss.item() < best_loss:
            best_loss = loss.item()

        if epoch % 300 == 0 or epoch == epochs - 1:
            print(f" Epoch {epoch:4d}/{epochs} Loss: {loss.item():.6f} "
                  f"Best: {best_loss:.6f} LR: {scheduler.get_last_lr()[0]:.6f}")

    return model, features_norm, norm_stats

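# Pure-Python sketch of the normalisation above (my addition, for clarity):
# min-max scale a feature column to [0, 1], with constant columns mapped to
# zeros rather than dividing by zero, matching the feat_range guard.
def _minmax_normalise(column: list[float]) -> list[float]:
    lo, hi = min(column), max(column)
    rng = (hi - lo) if hi > lo else 1.0
    return [(v - lo) / rng for v in column]
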
# ---------------------------------------------------------------------------
# 5. Analysis
# ---------------------------------------------------------------------------

def probe_relationships(model: WoRFv2, features_norm: torch.Tensor, norm_stats: dict):
    """Probe what the field learned about feature relationships."""
    print("\n" + "=" * 60)
    print("RELATIONAL FIELD ANALYSIS")
    print("=" * 60)

    model.eval()

    # --- Test 1: Feature predictability ---
    print("\nFeature predictability (lower error = stronger relationship to others):")
    print("-" * 60)

    feature_errors = {}
    with torch.no_grad():
        for f in range(NUM_FEATURES):
            mask_idx = torch.full((len(features_norm),), f, dtype=torch.long)
            pred = model(features_norm, mask_idx)
            error = torch.mean((pred[:, f] - features_norm[:, f]) ** 2).item()
            feature_errors[FEATURE_NAMES[f]] = error

    sorted_features = sorted(feature_errors.items(), key=lambda x: x[1])
    for name, error in sorted_features:
        bar_len = int((1 - min(error * 20, 1)) * 40)
        bar = "#" * bar_len
        predictability = "highly relational" if error < 0.01 else \
            "moderately relational" if error < 0.05 else "independent"
        print(f" {name:30s} error={error:.5f} [{bar:40s}] {predictability}")

    # --- Test 2: Feature influence matrix ---
    print("\n\nFeature influence matrix:")
    print("(When feature X increases, what happens to feature Y?)")
    print("-" * 60)

    influence_matrix = np.zeros((NUM_FEATURES, NUM_FEATURES))

    with torch.no_grad():
        baseline = features_norm.mean(dim=0, keepdim=True)

        for source_f in range(NUM_FEATURES):
            high = baseline.clone()
            low = baseline.clone()
            high[0, source_f] = 0.9
            low[0, source_f] = 0.1

            for target_f in range(NUM_FEATURES):
                if target_f == source_f:
                    continue
                mask = torch.tensor([target_f])
                pred_high = model(high, mask)[0, target_f].item()
                pred_low = model(low, mask)[0, target_f].item()
                influence_matrix[source_f, target_f] = pred_high - pred_low

    # Print matrix
    short_names = [n[:8] for n in FEATURE_NAMES]
    print(f"\n {'':30s}", end="")
    for sn in short_names:
        print(f" {sn:>8s}", end="")
    print()

    for i, name in enumerate(FEATURE_NAMES):
        print(f" {name:30s}", end="")
        for j in range(NUM_FEATURES):
            val = influence_matrix[i, j]
            if i == j:
                print(f" {'---':>8s}", end="")
            elif abs(val) > 0.15:
                print(f" {val:>+7.2f}*", end="")
            else:
                print(f" {val:>+8.3f}", end="")
        print()

    # --- Test 3: Style interpolation ---
    print("\n\nStyle interpolation (walking through the field):")
    print("-" * 60)
    print("Interpolating between 'narrative exposition' and 'snappy dialogue':\n")

    with torch.no_grad():
        narrative = baseline.clone()
        narrative[0, FEATURE_NAMES.index("dialogue_ratio")] = 0.05
        narrative[0, FEATURE_NAMES.index("avg_sentence_length")] = 0.8
        narrative[0, FEATURE_NAMES.index("short_sentence_ratio")] = 0.1
        narrative[0, FEATURE_NAMES.index("vocabulary_richness")] = 0.8

        dialogue = baseline.clone()
        dialogue[0, FEATURE_NAMES.index("dialogue_ratio")] = 0.9
        dialogue[0, FEATURE_NAMES.index("avg_sentence_length")] = 0.2
        dialogue[0, FEATURE_NAMES.index("short_sentence_ratio")] = 0.8
        dialogue[0, FEATURE_NAMES.index("vocabulary_richness")] = 0.4

        predict_features = [
            FEATURE_NAMES.index("exclamation_density"),
            FEATURE_NAMES.index("question_density"),
            FEATURE_NAMES.index("dash_density"),
            FEATURE_NAMES.index("aside_density"),
            FEATURE_NAMES.index("avg_punct_per_sentence"),
        ]

        print(f" {'blend':>5s}", end="")
        for name in ["excl_dens", "quest_dens", "dash_dens", "aside_dens", "punct/sent"]:
            print(f" {name:>10s}", end="")
        print()
        print(f" {'':>5s}{'':->55s}")

        for alpha in np.linspace(0, 1, 11):
            blended = narrative * (1 - alpha) + dialogue * alpha
            predictions = []
            for pf in predict_features:
                mask = torch.tensor([pf])
                pred = model(blended, mask)[0, pf].item()
                # Denormalise back to real units.
                pred_real = pred * norm_stats["range"][pf].item() + norm_stats["min"][pf].item()
                predictions.append(pred_real)

            label = "narr" if alpha < 0.3 else "dial" if alpha > 0.7 else "mix"
            print(f" {alpha:4.1f} {label:4s}", end="")
            for p in predictions:
                print(f" {p:10.4f}", end="")
            print()

    # --- Test 4: Reconstruction accuracy ---
    print("\n\nReconstruction accuracy per feature:")
    print("-" * 60)

    with torch.no_grad():
        total_error = 0.0
        total_count = 0

        for f in range(NUM_FEATURES):
            mask = torch.full((len(features_norm),), f, dtype=torch.long)
            pred = model(features_norm, mask)
            errors = (pred[:, f] - features_norm[:, f]) ** 2
            rmse_real = math.sqrt(errors.mean().item()) * norm_stats["range"][f].item()
            total_error += errors.sum().item()
            total_count += len(errors)
            print(f" {FEATURE_NAMES[f]:30s} RMSE (real units): {rmse_real:.4f}")

        avg_error = total_error / total_count
        print(f"\n Overall MSE: {avg_error:.6f}")
        print(f" Overall RMSE: {math.sqrt(avg_error):.4f}")

    return influence_matrix

# ---------------------------------------------------------------------------
# 6. Save
# ---------------------------------------------------------------------------

def save_results(influence_matrix: np.ndarray, output_path: str):
    """Save the influence matrix and metadata."""
    results = {
        "feature_names": FEATURE_NAMES,
        "influence_matrix": influence_matrix.tolist(),
        "description": "WoRF v2: inter-feature influence matrix from masked prediction",
        "interpretation": "influence_matrix[i][j] = when feature i goes high, "
                          "how much does the predicted value of feature j change",
    }
    Path(output_path).write_text(json.dumps(results, indent=2))
    print(f"\nResults saved to {output_path}")

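# Helper sketch (my addition, not called anywhere): list the feature pairs
# whose influence exceeds the 0.15 threshold that the matrix printout stars,
# for reading the saved JSON back. A positive value means pushing feature i
# high raises the field's prediction for feature j.
def _strong_influences(matrix: list[list[float]], names: list[str],
                       threshold: float = 0.15) -> list[tuple[str, str, float]]:
    return [(names[i], names[j], v)
            for i, row in enumerate(matrix)
            for j, v in enumerate(row)
            if i != j and abs(v) > threshold]
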
# ---------------------------------------------------------------------------
# Main
# ---------------------------------------------------------------------------

def main():
    print("WoRF v2 — Word Radiance Field (Relational)")
    print("=" * 60)

    all_chunks = []

    book_path = WORF_DIR / "pg-wood.txt"
    if book_path.exists():
        all_chunks.extend(load_gutenberg(str(book_path), "My Man Jeeves"))

    # Add more books here:
    # all_chunks.extend(load_gutenberg("pg-wilde.txt", "Importance of Being Earnest"))
    # all_chunks.extend(load_gutenberg("pg-austen.txt", "Pride and Prejudice"))

    if not all_chunks:
        print("No books found!")
        return

    books = set(c.book for c in all_chunks)
    print(f"\nTotal: {len(all_chunks)} chunks from {len(books)} book(s)")

    model, features_norm, norm_stats = train_worf_v2(all_chunks, epochs=4000)

    influence_matrix = probe_relationships(model, features_norm, norm_stats)

    save_results(influence_matrix, str(WORF_DIR / "worf-v2-relations.json"))

    print("\n" + "=" * 60)
    print("The relational field exists.")
    print("This is what Wodehouse's English 'feels like' in feature space.")
    print("Add more books to build toward an EN-GB WoRF language pack.")
    print("=" * 60)


if __name__ == "__main__":
    main()
12042 experiments/worf/worf-field-jeeves.json (new file — diff suppressed because it is too large)

162 experiments/worf/worf-v2-relations.json (new file)
@@ -0,0 +1,162 @@
{
  "feature_names": [
    "avg_word_length",
    "avg_sentence_length",
    "sentence_length_variance",
    "dialogue_ratio",
    "vocabulary_richness",
    "dash_density",
    "exclamation_density",
    "question_density",
    "short_sentence_ratio",
    "aside_density",
    "avg_punct_per_sentence"
  ],
  "influence_matrix": [
    [0.0, -0.03247326612472534, -0.0239107608795166, -0.00048324093222618103, 0.1107892394065857, 0.015222892165184021, -0.024353697896003723, 0.02327282726764679, 0.055540263652801514, 0.04952073097229004, -0.018031805753707886],
    [-0.11262395977973938, 0.0, 0.1966363489627838, 0.0003904178738594055, -0.02297872304916382, -0.068694107234478, -0.12937799841165543, -0.19205902516841888, -0.29318100214004517, -0.09364050626754761, 0.21115505695343018],
    [0.005609989166259766, 0.13626961410045624, 0.0, -0.0007154941558837891, -0.02271491289138794, 0.005668185651302338, -0.0020959973335266113, -0.01791289448738098, 0.04299241304397583, 0.03149789571762085, 0.153947114944458],
    [-0.01625087857246399, 0.012996375560760498, 0.004404813051223755, 0.0, -0.004828751087188721, -0.010406054556369781, 0.012377187609672546, -0.007560417056083679, 0.017317771911621094, -0.006858497858047485, 0.013844549655914307],
    [0.05449041724205017, -0.002728700637817383, 0.03543153405189514, -0.0007495768368244171, 0.0, 0.02357766404747963, -0.06922292709350586, -0.01401202380657196, 0.03409099578857422, -0.022808074951171875, -0.06983467936515808],
    [0.05502724647521973, -0.028156444430351257, 0.016653388738632202, -0.0004658550024032593, 0.008968591690063477, 0.0, 0.07332807779312134, 0.004690051078796387, 0.004198431968688965, 0.1471288800239563, 0.1343848705291748],
    [-0.008408337831497192, -0.03403817117214203, -0.03511646389961243, 0.0002146884799003601, 0.01336967945098877, 0.012008734047412872, 0.0, -0.038716867566108704, 0.01683211326599121, 0.015300273895263672, 0.038202375173568726],
    [-0.04866918921470642, -0.09030131995677948, -0.08065217733383179, 0.0006130747497081757, -0.04372537136077881, 0.035463668406009674, 0.020850971341133118, 0.0, 0.06807422637939453, 0.04871469736099243, 0.015091657638549805],
    [0.07264012098312378, -0.17126457393169403, 0.007805615663528442, 0.0005212798714637756, -0.07545053958892822, -0.011027880012989044, 0.16361884027719498, 0.1303078681230545, 0.0, 0.08242395520210266, -0.042179644107818604],
    [0.05252787470817566, -0.06419773399829865, 0.006353020668029785, -0.0005619712173938751, -0.03329026699066162, 0.04053857922554016, 0.05099382996559143, 0.0370599627494812, 0.05590474605560303, 0.0, 0.22894394397735596],
    [-0.011781513690948486, 0.0985381007194519, 0.09538811445236206, -0.00027120113372802734, -0.0469667911529541, 0.04663299024105072, 0.04154162108898163, 0.0520768016576767, -0.12925076484680176, 0.32439711689949036, 0.0]
  ],
  "description": "WoRF v2: inter-feature influence matrix from masked prediction",
  "interpretation": "influence_matrix[i][j] = when feature i goes high, how much does the predicted value of feature j change"
}