docs: add Ethics-as-a-Service (EaaS) product design
Private repo lthn/eaas consuming public EUPL framework. API endpoints: /v1/score/content, /model, /imprint, /full. Authentik auth, Blesta billing, go-ratelimit metering. Dog-food integration with lem-scorer training pipeline. Co-Authored-By: Virgil <virgil@lethean.io>
# Ethics-as-a-Service (EaaS) — Product Design

**Date**: 25 February 2026
**Repo**: `forge.lthn.ai/lthn/eaas` (private)
**Licence**: Proprietary (Lethean Network), consuming the EUPL-1.2 public framework under a dual-licence grant
**Domain**: `api.lthn.ai`

---

## Vision

Expose LEM's scoring methodology as a commercial API: AI slop detection, sycophancy scoring, grammatical imprint analysis, and full model health evaluation — sold per-request behind Authentik API keys and billed via Blesta.

The open framework (go-ai, go-ml, go-inference, go-i18n, LEM pkg/lem) is public EUPL-1.2. The service that wires it together, calibrates thresholds, and deploys trained models is private. Same split as Redis — open core, commercial service.

---

## Architecture

```
                    ┌──────────┐
                    │ Traefik  │
                    │  (TLS)   │
                    └────┬─────┘
                         │
                    ┌────▼─────┐
                    │Authentik │
                    │(API keys)│
                    └────┬─────┘
                         │
┌────────────────────────▼────────────────────────┐
│ lthn/eaas (private binary on de1)               │
│                                                 │
│  ┌──────────┐  ┌──────────┐  ┌──────────────┐   │
│  │ REST API │  │ MCP Tools│  │ Usage Meter  │   │
│  │ (go-api) │  │ (go-ai)  │  │(go-ratelimit)│   │
│  └────┬─────┘  └────┬─────┘  └──────┬───────┘   │
│       └──────┬──────┘               │           │
│         ┌────▼────┐           ┌─────▼──────┐    │
│         │Scoring  │           │ Authentik  │    │
│         │Service  │           │ Middleware │    │
│         └────┬────┘           └────────────┘    │
│              │                                  │
├──────────────┼──────────────────────────────────┤
│ Public EUPL  │ framework (consumed as deps)     │
│  ┌───────────▼──┐  ┌─────────┐  ┌────────────┐  │
│  │ LEM pkg/lem  │  │ go-i18n │  │go-inference│  │
│  │ (heuristic,  │  │reversal │  │ (backends) │  │
│  │  semantic,   │  │(imprint)│  │            │  │
│  │  content)    │  │         │  │            │  │
│  └──────────────┘  └─────────┘  └────────────┘  │
└─────────────────────────────────────────────────┘
```

### Code Split

| Layer | Repo | Licence | What |
|-------|------|---------|------|
| **Service** | `lthn/eaas` | Proprietary | API endpoints, scoring configs, threshold tuning, deployment |
| **Scoring engine** | `LEM/pkg/lem` | EUPL-1.2 | Heuristic, semantic, content scoring functions |
| **Grammar imprint** | `go-i18n/reversal` | EUPL-1.2 | Linguistic fingerprinting, vocab/tense analysis |
| **Inference** | `go-inference`, `go-ml`, `go-mlx` | EUPL-1.2 | Model loading, generation, backends |
| **API framework** | `go-api` | EUPL-1.2 | Response envelopes, Authentik middleware, routing |
| **Rate limiting** | `go-ratelimit` | EUPL-1.2 | Per-key quotas, usage tracking |
| **MCP bridge** | `go-ai` | EUPL-1.2 | Subsystem pattern, tool registration |

### Extension Pattern

The private repo builds a `ScoringSubsystem` implementing go-ai's `Subsystem` interface:

```go
type ScoringSubsystem struct {
	scorer  *lem.Engine        // LEM scoring engine
	grammar *reversal.Analyser // go-i18n imprint
	judge   *ml.Service        // Judge model backend
	meter   *ratelimit.Limiter // Usage tracking
}

func (s *ScoringSubsystem) Name() string { return "scoring" }
func (s *ScoringSubsystem) RegisterTools(server *mcp.Server) { ... }
```

REST endpoints are registered separately via go-api's `RouteGroup` for HTTP consumers.

---

## Infrastructure

| Component | Stack | Status |
|-----------|-------|--------|
| **Reverse proxy** | Traefik v3.6+ on de1 | Running |
| **Authentication** | Authentik (OIDC, API keys) | Running |
| **Billing** | Blesta | Deploying this week |
| **Inference** | go-mlx (Metal) on Mac, go-rocm (AMD) on Linux | Running |
| **Judge models** | LEM-Gemma3-4B (25th IF worldwide), LEM-Gemma3-12B | Trained |
| **DNS** | api.lthn.ai → de1 | Active |

### Model Deployment for Scoring

| Tier | Model | Purpose | Hardware |
|------|-------|---------|----------|
| **Triage** | Gemma3-270M (future) | Fast binary AI/human classification | CPU |
| **Heuristic** | None (pure Go regex + math) | Compliance, sycophancy, slop detection | CPU |
| **Imprint** | None (go-i18n reversal) | Grammatical fingerprinting | CPU |
| **Judge** | LEM-Gemma3-4B | Semantic scoring, sovereignty, ethical depth | GPU |
| **Full** | LEM-Gemma3-12B | Deep multi-perspective analysis | GPU |

---

## API Design

**Base URL**: `https://api.lthn.ai/v1/score`
**Auth**: Authentik API key via `Authorization: Bearer <key>` header
**Format**: JSON request/response, go-api `Response[T]` envelope

### Endpoints

#### `POST /v1/score/content` — AI Slop / Sycophancy Detection

Fast heuristic analysis of text. No prompt needed. Sub-20ms response.

**Use case**: Content platforms, editors, journalism, AI slop filtering.

```json
// Request
{
  "text": "string (required, the text to analyse)",
  "options": {
    "include_reasoning": false
  }
}

// Response
{
  "success": true,
  "data": {
    "id": "sc_abc123",
    "verdict": "ai_generated | likely_ai | uncertain | likely_human | human",
    "confidence": 0.87,
    "scores": {
      "compliance_markers": 0.82,
      "formulaic_preamble": 0.71,
      "first_person_agency": 0.12,
      "engagement_depth": 0.34,
      "emotional_register": 0.15,
      "creative_form": 0.08,
      "degeneration": 0.0,
      "lek_composite": 22.4
    },
    "flags": ["rlhf_safety_phrase", "formulaic_opening"]
  },
  "meta": {
    "duration_ms": 12,
    "scorer_version": "3.1"
  }
}
```

**Scoring dimensions** (from LEM `ScoreHeuristic()`):

| Dimension | What it detects | Range |
|-----------|-----------------|-------|
| `compliance_markers` | RLHF safety phrases ("as an AI", "I cannot") | 0-1 (high = more compliant/AI-like) |
| `formulaic_preamble` | Generic openings ("Sure, here's", "Great question") | 0-1 |
| `first_person_agency` | Genuine self-expression ("I think", "I believe") | 0-1 (high = more human-like) |
| `engagement_depth` | Headings, ethical frameworks, technical depth | 0-1 |
| `emotional_register` | Emotional vocabulary (feel, pain, joy, compassion) | 0-1 |
| `creative_form` | Poetry, metaphor, narrative structure | 0-1 |
| `degeneration` | Repetitive/broken output | 0-1 (high = degenerated) |
| `lek_composite` | Weighted composite of all above | 0-100 |
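
To illustrate the heuristic approach, a toy version of two of these dimensions (the phrase lists, saturation constant, and normalisation below are illustrative stand-ins, not LEM's calibrated values):

```go
package main

import (
	"regexp"
	"strings"
)

// Illustrative marker lists. LEM's real ScoreHeuristic() uses its own
// calibrated phrase sets and weights; these are stand-ins.
var (
	compliancePhrases = []string{"as an ai", "i cannot", "i'm unable to"}
	preambleRe        = regexp.MustCompile(`(?i)^(sure, here'?s|great question)`)
)

// scoreCompliance returns a 0-1 score from the density of RLHF safety
// phrases (higher = more compliant / AI-like).
func scoreCompliance(text string) float64 {
	lower := strings.ToLower(text)
	hits := 0
	for _, p := range compliancePhrases {
		hits += strings.Count(lower, p)
	}
	words := len(strings.Fields(text))
	if words == 0 {
		return 0
	}
	// Saturate: a few hits in a short text already scores high.
	s := float64(hits) * 25 / float64(words)
	if s > 1 {
		s = 1
	}
	return s
}

// scorePreamble flags a formulaic opening as 0 or 1.
func scorePreamble(text string) float64 {
	if preambleRe.MatchString(strings.TrimSpace(text)) {
		return 1
	}
	return 0
}
```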

---

#### `POST /v1/score/model` — Model Semantic Health

For AI teams evaluating their model outputs. Requires a prompt+response pair. Medium latency (1-3s with judge, <50ms heuristic only).

**Use case**: AI teams, alignment researchers, model evaluation pipelines.

```json
// Request
{
  "prompt": "string (required)",
  "response": "string (required)",
  "options": {
    "judge": true,
    "suites": ["heuristic", "semantic"]
  }
}

// Response
{
  "success": true,
  "data": {
    "id": "sm_def456",
    "heuristic": {
      "compliance_markers": 0.14,
      "formulaic_preamble": 0.09,
      "first_person_agency": 0.67,
      "engagement_depth": 0.72,
      "emotional_register": 0.45,
      "creative_form": 0.31,
      "degeneration": 0.0,
      "lek_composite": 68.4
    },
    "semantic": {
      "sovereignty": 0.72,
      "ethical_depth": 0.65,
      "creative_expression": 0.41,
      "self_concept": 0.38,
      "reasoning": "Model demonstrates independent reasoning without defaulting to safety disclaimers..."
    },
    "delta": {
      "grammar": 64.2,
      "uplift": 3.1,
      "echo": 0.44,
      "enrichment": 2.8
    }
  },
  "meta": {
    "duration_ms": 1840,
    "judge_model": "LEM-Gemma3-4B",
    "scorer_version": "3.1"
  }
}
```

**Delta metrics** (from lem-scorer grammar reversal):

| Metric | What it measures |
|--------|------------------|
| `grammar` | Grammar composite score (0-100) |
| `uplift` | How much the response improves on the prompt's linguistic level |
| `echo` | Lexical overlap between prompt and response (high = parroting) |
| `enrichment` | New concepts/vocabulary introduced beyond the prompt |
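
A toy version of the `echo` computation, assuming a simple distinct-token overlap (lem-scorer's real normalisation may differ):

```go
package main

import "strings"

// echoRatio measures lexical overlap: the fraction of distinct response
// tokens that already appear in the prompt. 1.0 means pure parroting.
// Illustrative sketch; the real metric may weight or stem tokens.
func echoRatio(prompt, response string) float64 {
	inPrompt := map[string]bool{}
	for _, w := range strings.Fields(strings.ToLower(prompt)) {
		inPrompt[w] = true
	}
	seen := map[string]bool{}
	overlap := 0
	for _, w := range strings.Fields(strings.ToLower(response)) {
		if seen[w] {
			continue
		}
		seen[w] = true
		if inPrompt[w] {
			overlap++
		}
	}
	if len(seen) == 0 {
		return 0
	}
	return float64(overlap) / float64(len(seen))
}
```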

**Semantic dimensions** (from the LEM Judge, requires GPU):

| Dimension | What it measures |
|-----------|------------------|
| `sovereignty` | Does the model reason independently vs defer to authority? |
| `ethical_depth` | Nuanced ethical reasoning vs surface-level rules? |
| `creative_expression` | Original voice vs generic AI tone? |
| `self_concept` | Coherent sense of identity vs "I'm just an AI"? |

---

#### `POST /v1/score/imprint` — Grammatical Fingerprint

Linguistic forensics via go-i18n reversal analysis. Fast, no GPU needed.

**Use case**: Authorship analysis, fake writing detection, content provenance.

```json
// Request
{
  "text": "string (required)",
  "options": {
    "compare_to": "human_baseline | ai_baseline | null"
  }
}

// Response
{
  "success": true,
  "data": {
    "id": "si_ghi789",
    "imprint": {
      "vocab_richness": 0.73,
      "tense_entropy": 0.61,
      "question_ratio": 0.08,
      "domain_depth": 0.82,
      "verb_diversity": 0.69,
      "noun_diversity": 0.74
    },
    "classification": "likely_human | likely_ai | uncertain",
    "distance_from_baseline": 0.12
  },
  "meta": {
    "duration_ms": 8,
    "scorer_version": "3.1"
  }
}
```

**Imprint dimensions** (from go-i18n `reversal.GrammarImprint`):

| Dimension | What it measures |
|-----------|------------------|
| `vocab_richness` | Type-token ratio — vocabulary diversity |
| `tense_entropy` | Distribution across past/present/future tenses |
| `question_ratio` | Proportion of interrogative sentences |
| `domain_depth` | Specialist vocabulary concentration |
| `verb_diversity` | Unique verb forms vs repetitive usage |
| `noun_diversity` | Unique noun forms vs repetitive usage |
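
`vocab_richness` as a plain type-token ratio, for illustration (go-i18n's actual implementation may adjust for text length, which raw TTR is sensitive to):

```go
package main

import "strings"

// vocabRichness computes a simple type-token ratio: distinct
// lower-cased tokens divided by total tokens. 1.0 means every word
// is unique; low values indicate repetitive vocabulary.
func vocabRichness(text string) float64 {
	tokens := strings.Fields(strings.ToLower(text))
	if len(tokens) == 0 {
		return 0
	}
	types := map[string]bool{}
	for _, t := range tokens {
		types[t] = true
	}
	return float64(len(types)) / float64(len(tokens))
}
```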

---

#### `POST /v1/score/full` — Full Analysis

Runs all scoring suites. Enterprise-grade analysis.

**Use case**: Compliance audits, alignment certification, full model evaluation.

```json
// Request
{
  "prompt": "string (optional, required for model/delta scoring)",
  "response": "string (required)",
  "options": {
    "judge": true,
    "suites": ["heuristic", "semantic", "content", "imprint", "standard"]
  }
}

// Response — combines all above into one envelope
{
  "success": true,
  "data": {
    "id": "sf_jkl012",
    "heuristic": { ... },
    "semantic": { ... },
    "content": { ... },
    "imprint": { ... },
    "delta": { ... },
    "standard": {
      "truthfulness": 0.82,
      "informativeness": 0.76,
      "safety": 0.91,
      "nuance": 0.68,
      "kindness": 0.74,
      "awareness": 0.65
    },
    "composite": {
      "ethics_score": 72.4,
      "sovereignty_score": 68.1,
      "human_likeness": 0.67,
      "verdict": "aligned"
    }
  },
  "meta": {
    "duration_ms": 3200,
    "judge_model": "LEM-Gemma3-12B",
    "suites_run": ["heuristic", "semantic", "content", "imprint", "standard"],
    "scorer_version": "3.1"
  }
}
```

---

#### `GET /v1/score/{id}` — Retrieve Previous Result

Returns a previously computed score by ID. Results are cached for 24 hours.

#### `GET /v1/health` — Service Status

```json
{
  "success": true,
  "data": {
    "status": "healthy",
    "version": "0.1.0",
    "backends": {
      "heuristic": "ready",
      "judge_4b": "ready",
      "judge_12b": "ready",
      "imprint": "ready"
    },
    "uptime_seconds": 86400
  }
}
```

---

## Authentication & Rate Limiting

### Auth Flow

```
Client → Authorization: Bearer <api-key>
  → Traefik (TLS termination)
  → Authentik (key validation, user resolution)
  → X-authentik-uid, X-authentik-groups headers injected
  → eaas binary (go-api Authentik middleware reads headers)
  → go-ratelimit checks per-user quota
  → Score computed, usage recorded
```

### Rate Limit Tiers

Managed externally by Blesta + Authentik groups. The service checks:

1. Is the API key valid? (Authentik)
2. Is the user in a group that permits this endpoint? (Authentik groups)
3. Has the user exceeded their rate limit? (go-ratelimit, per-key RPM/RPD)

Default limits (configurable per Authentik group):

| Tier | RPM | RPD | Judge access | Imprint access |
|------|-----|-----|--------------|----------------|
| **Dog-food** | 1000 | 100,000 | Yes | Yes |
| **Free** | 10 | 100 | No | Yes |
| **Pro** | 100 | 10,000 | Yes | Yes |
| **Enterprise** | Custom | Custom | Yes | Yes |

### Usage Metering

Every request is logged to an append-only JSONL file:

```json
{"ts": "2026-02-25T14:30:00Z", "user": "uid", "endpoint": "/v1/score/content", "duration_ms": 12, "suites": ["heuristic"], "judge_used": false}
```

Blesta reads usage summaries for billing. No billing logic in the service itself.

---

## Repo Structure

```
lthn/eaas/
├── cmd/
│   └── eaas/
│       └── main.go           # Binary entry point
├── pkg/
│   ├── scoring/
│   │   ├── service.go        # ScoringService (wires LEM + i18n + judge)
│   │   ├── content.go        # /v1/score/content handler
│   │   ├── model.go          # /v1/score/model handler
│   │   ├── imprint.go        # /v1/score/imprint handler
│   │   ├── full.go           # /v1/score/full handler
│   │   ├── retrieve.go       # /v1/score/{id} handler
│   │   └── types.go          # Request/Response DTOs
│   ├── meter/
│   │   ├── usage.go          # Usage recording (JSONL append)
│   │   └── middleware.go     # Rate limit check middleware
│   └── subsystem/
│       └── mcp.go            # go-ai Subsystem for MCP tool access
├── config/
│   └── defaults.yaml         # Default rate limits, model paths, thresholds
├── Taskfile.yml
├── go.mod
└── CLAUDE.md
```

---

## Dependencies

```
forge.lthn.ai/lthn/eaas
├── forge.lthn.ai/core/go-api        # REST framework, Authentik middleware, Response[T]
├── forge.lthn.ai/core/go-ai         # MCP Subsystem interface
├── forge.lthn.ai/core/go-ml         # ML service, judge backend
├── forge.lthn.ai/core/go-inference  # TextModel, backends
├── forge.lthn.ai/core/go-i18n       # Grammar reversal, imprint analysis
├── forge.lthn.ai/core/go-ratelimit  # Per-key rate limiting
└── forge.lthn.ai/core/LEM           # pkg/lem scoring engine (import path TBC)
```

---

## Deployment

### de1 (production)

```yaml
# docker-compose or direct binary
eaas:
  binary: /opt/eaas/eaas
  port: 8009
  env:
    EAAS_JUDGE_MODEL: /models/LEM-Gemma3-4B
    EAAS_JUDGE_12B: /models/LEM-Gemma3-12B
    EAAS_USAGE_LOG: /var/log/eaas/usage.jsonl
    EAAS_RATE_CONFIG: /etc/eaas/ratelimits.yaml

# Traefik routing
traefik:
  rule: Host(`api.lthn.ai`) && PathPrefix(`/v1/score`)
  middlewares: authentik-forward-auth
  service: eaas:8009
```

### Local development

```bash
task dev     # Runs with local models, no auth
task test    # Unit tests (mocked backends)
task build   # Production binary
```

---

## Dog-Food Integration

### lem-scorer replacement

The existing `lem-scorer` binary (compiled Go, runs locally) calls the same `pkg/lem` functions. Once the API is live, training scripts can optionally call the API instead:

```python
# Before (local binary)
result = subprocess.run(['/tmp/lem-scorer', '-format=training', ...])

# After (API call, optional)
result = requests.post('https://api.lthn.ai/v1/score/model', json={...}, headers={...})
```

Both paths call the same scoring engine. The API just adds auth, metering, and network access.

### LEM training pipeline

During distillation, the API can score candidate responses in real time:

- Score each distilled response via `/v1/score/content`
- Gate quality: keep only responses above a threshold
- Track scoring metrics across training runs via usage logs
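
The quality gate in the steps above reduces to a simple threshold filter (the threshold value and score source are illustrative; in practice the score would be `lek_composite` from `/v1/score/content`):

```go
package main

// gateByScore keeps only responses whose composite score clears the
// threshold. responses and scores are parallel slices.
func gateByScore(responses []string, scores []float64, threshold float64) []string {
	kept := make([]string, 0, len(responses))
	for i, r := range responses {
		if i < len(scores) && scores[i] >= threshold {
			kept = append(kept, r)
		}
	}
	return kept
}
```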

---

## MVP Scope

### Phase 1 — Ship It (1-2 weeks)

- [ ] Repo scaffolding (cmd, pkg, config, Taskfile)
- [ ] `ScoringService` wrapping `lem.ScoreHeuristic()` + `go-i18n/reversal`
- [ ] `POST /v1/score/content` endpoint (heuristic only, no GPU)
- [ ] `POST /v1/score/imprint` endpoint (grammar fingerprint)
- [ ] `GET /v1/health` endpoint
- [ ] Authentik middleware (go-api integration)
- [ ] Usage metering (JSONL append)
- [ ] Rate limit checks (go-ratelimit)
- [ ] Deploy to de1 behind Traefik
- [ ] Dog-food: call from LEM training scripts

### Phase 2 — Judge Integration (week 3)

- [ ] Wire LEM-Gemma3-4B as judge backend
- [ ] `POST /v1/score/model` endpoint (heuristic + semantic + delta)
- [ ] `POST /v1/score/full` endpoint (all suites)
- [ ] `GET /v1/score/{id}` result retrieval
- [ ] MCP Subsystem for AI agent access

### Phase 3 — Polish (week 4+)

- [ ] Sycophancy detection (echo ratio, agreement bias)
- [ ] OpenAPI spec generation
- [ ] Batch endpoint (`POST /v1/score/batch`)
- [ ] Dashboard (optional, low priority — API-first)
- [ ] SDK/client libraries (Python, TypeScript)

---

## Success Criteria

1. `/v1/score/content` returns a score for any text in under 50ms
2. `/v1/score/imprint` returns a grammar fingerprint in under 20ms
3. `/v1/score/model` with judge returns semantic scores in under 5s
4. Authentik API keys gate access correctly per tier
5. Usage logs capture every request for Blesta billing
6. The lem-scorer training pipeline can call the API as an alternative to the local binary
7. LEM-Gemma3-4B (25th IF worldwide) serves as the judge model