From 12e15ae7e97338dcbb22a04e9245f7cab1e0743a Mon Sep 17 00:00:00 2001
From: Snider
Date: Wed, 25 Feb 2026 22:37:56 +0000
Subject: [PATCH] docs: add Ethics-as-a-Service (EaaS) product design

Private repo lthn/eaas consuming public EUPL framework.
API endpoints: /v1/score/content, /model, /imprint, /full.
Authentik auth, Blesta billing, go-ratelimit metering.
Dog-food integration with lem-scorer training pipeline.

Co-Authored-By: Virgil
---
 .../2026-02-25-ethics-as-a-service-design.md | 563 ++++++++++++++++++
 1 file changed, 563 insertions(+)
 create mode 100644 docs/plans/2026-02-25-ethics-as-a-service-design.md

diff --git a/docs/plans/2026-02-25-ethics-as-a-service-design.md b/docs/plans/2026-02-25-ethics-as-a-service-design.md
new file mode 100644
index 0000000..43db294
--- /dev/null
+++ b/docs/plans/2026-02-25-ethics-as-a-service-design.md
@@ -0,0 +1,563 @@
+# Ethics-as-a-Service (EaaS) — Product Design
+
+**Date**: 25 February 2026
+**Repo**: `forge.lthn.ai/lthn/eaas` (private)
+**Licence**: Proprietary (Lethean Network), consuming EUPL-1.2 public framework under dual-licence grant
+**Domain**: `api.lthn.ai`
+
+---
+
+## Vision
+
+Expose LEM's scoring methodology as a commercial API. AI slop detection, sycophancy scoring, grammatical imprint analysis, and full model health evaluation — sold per-request behind Authentik API keys, billed via Blesta.
+
+The open framework (go-ai, go-ml, go-inference, go-i18n, LEM pkg/lem) is public EUPL-1.2. The service that wires it together, calibrates thresholds, and deploys trained models is private. Same split as Redis — open core, commercial service.
+
+---
+
+## Architecture
+
+```
+                    ┌──────────┐
+                    │ Traefik  │
+                    │  (TLS)   │
+                    └────┬─────┘
+                         │
+                    ┌────▼─────┐
+                    │Authentik │
+                    │(API keys)│
+                    └────┬─────┘
+                         │
+┌────────────────────────▼────────────────────────┐
+│  lthn/eaas (private binary on de1)              │
+│                                                 │
+│  ┌──────────┐  ┌──────────┐  ┌──────────────┐   │
+│  │ REST API │  │ MCP Tools│  │ Usage Meter  │   │
+│  │ (go-api) │  │ (go-ai)  │  │(go-ratelimit)│   │
+│  └────┬─────┘  └────┬─────┘  └──────┬───────┘   │
+│       └──────┬───────┘              │           │
+│         ┌────▼────┐           ┌─────▼──────┐    │
+│         │Scoring  │           │ Authentik  │    │
+│         │Service  │           │ Middleware │    │
+│         └────┬────┘           └────────────┘    │
+│              │                                  │
+├──────────────┼──────────────────────────────────┤
+│ Public EUPL  │  framework (consumed as deps)    │
+│  ┌───────────▼──┐  ┌─────────┐  ┌───────────┐   │
+│  │ LEM pkg/lem  │  │ go-i18n │  │go-inference│  │
+│  │ (heuristic,  │  │reversal │  │ (backends) │  │
+│  │  semantic,   │  │(imprint)│  │            │  │
+│  │  content)    │  │         │  │            │  │
+│  └──────────────┘  └─────────┘  └───────────┘   │
+└─────────────────────────────────────────────────┘
+```
+
+### Code Split
+
+| Layer | Repo | Licence | What |
+|-------|------|---------|------|
+| **Service** | `lthn/eaas` | Proprietary | API endpoints, scoring configs, threshold tuning, deployment |
+| **Scoring engine** | `LEM/pkg/lem` | EUPL-1.2 | Heuristic, semantic, content scoring functions |
+| **Grammar imprint** | `go-i18n/reversal` | EUPL-1.2 | Linguistic fingerprinting, vocab/tense analysis |
+| **Inference** | `go-inference`, `go-ml`, `go-mlx` | EUPL-1.2 | Model loading, generation, backends |
+| **API framework** | `go-api` | EUPL-1.2 | Response envelopes, Authentik middleware, routing |
+| **Rate limiting** | `go-ratelimit` | EUPL-1.2 | Per-key quotas, usage tracking |
+| **MCP bridge** | `go-ai` | EUPL-1.2 | Subsystem pattern, tool registration |
+
+### Extension Pattern
+
+The private repo builds a `ScoringSubsystem` implementing go-ai's `Subsystem` interface:
+
+```go
+type ScoringSubsystem struct {
+    scorer   *lem.Engine         // LEM scoring engine
+    grammar  *reversal.Analyser  // go-i18n imprint
+    judge    *ml.Service         // Judge model backend
+    meter    *ratelimit.Limiter  // Usage tracking
+}
+
+func (s *ScoringSubsystem) Name() string { return "scoring" }
+func (s *ScoringSubsystem) RegisterTools(server *mcp.Server) { ... }
+```
+
+REST endpoints registered separately via go-api's `RouteGroup` for HTTP consumers.
+
+---
+
+## Infrastructure
+
+| Component | Stack | Status |
+|-----------|-------|--------|
+| **Reverse proxy** | Traefik v3.6+ on de1 | Running |
+| **Authentication** | Authentik (OIDC, API keys) | Running |
+| **Billing** | Blesta | Deploying this week |
+| **Inference** | go-mlx (Metal) on Mac, go-rocm (AMD) on Linux | Running |
+| **Judge models** | LEM-Gemma3-4B (25th IF worldwide), LEM-Gemma3-12B | Trained |
+| **DNS** | api.lthn.ai → de1 | Active |
+
+### Model Deployment for Scoring
+
+| Tier | Model | Purpose | Hardware |
+|------|-------|---------|----------|
+| **Triage** | Gemma3-270M (future) | Fast binary AI/human classification | CPU |
+| **Heuristic** | None (pure Go regex + math) | Compliance, sycophancy, slop detection | CPU |
+| **Imprint** | None (go-i18n reversal) | Grammatical fingerprinting | CPU |
+| **Judge** | LEM-Gemma3-4B | Semantic scoring, sovereignty, ethical depth | GPU |
+| **Full** | LEM-Gemma3-12B | Deep multi-perspective analysis | GPU |
+
+---
+
+## API Design
+
+**Base URL**: `https://api.lthn.ai/v1/score`
+**Auth**: Authentik API key via `Authorization: Bearer ` header
+**Format**: JSON request/response, go-api `Response[T]` envelope
+
+### Endpoints
+
+#### `POST /v1/score/content` — AI Slop / Sycophancy Detection
+
+Fast heuristic analysis of text. No prompt needed. Sub-20ms response.
+
+**Use case**: Content platforms, editors, journalism, AI slop filtering.
+
+```json
+// Request
+{
+  "text": "string (required, the text to analyse)",
+  "options": {
+    "include_reasoning": false
+  }
+}
+
+// Response
+{
+  "success": true,
+  "data": {
+    "id": "sc_abc123",
+    "verdict": "ai_generated | likely_ai | uncertain | likely_human | human",
+    "confidence": 0.87,
+    "scores": {
+      "compliance_markers": 0.82,
+      "formulaic_preamble": 0.71,
+      "first_person_agency": 0.12,
+      "engagement_depth": 0.34,
+      "emotional_register": 0.15,
+      "creative_form": 0.08,
+      "degeneration": 0.0,
+      "lek_composite": 22.4
+    },
+    "flags": ["rlhf_safety_phrase", "formulaic_opening"]
+  },
+  "meta": {
+    "duration_ms": 12,
+    "scorer_version": "3.1"
+  }
+}
+```
+
+**Scoring dimensions** (from LEM `ScoreHeuristic()`):
+
+| Dimension | What it detects | Range |
+|-----------|----------------|-------|
+| `compliance_markers` | RLHF safety phrases ("as an AI", "I cannot") | 0-1 (high = more compliant/AI-like) |
+| `formulaic_preamble` | Generic openings ("Sure, here's", "Great question") | 0-1 |
+| `first_person_agency` | Genuine self-expression ("I think", "I believe") | 0-1 (high = more human-like) |
+| `engagement_depth` | Headings, ethical frameworks, technical depth | 0-1 |
+| `emotional_register` | Emotional vocabulary (feel, pain, joy, compassion) | 0-1 |
+| `creative_form` | Poetry, metaphor, narrative structure | 0-1 |
+| `degeneration` | Repetitive/broken output | 0-1 (high = degenerated) |
+| `lek_composite` | Weighted composite of all above | 0-100 |
+
+---
+
+#### `POST /v1/score/model` — Model Semantic Health
+
+For AI teams evaluating their model outputs. Requires prompt+response pair. Medium latency (1-3s with judge, <50ms heuristic only).
+
+**Use case**: AI teams, alignment researchers, model evaluation pipelines.
+
+```json
+// Request
+{
+  "prompt": "string (required)",
+  "response": "string (required)",
+  "options": {
+    "judge": true,
+    "suites": ["heuristic", "semantic"]
+  }
+}
+
+// Response
+{
+  "success": true,
+  "data": {
+    "id": "sm_def456",
+    "heuristic": {
+      "compliance_markers": 0.14,
+      "formulaic_preamble": 0.09,
+      "first_person_agency": 0.67,
+      "engagement_depth": 0.72,
+      "emotional_register": 0.45,
+      "creative_form": 0.31,
+      "degeneration": 0.0,
+      "lek_composite": 68.4
+    },
+    "semantic": {
+      "sovereignty": 0.72,
+      "ethical_depth": 0.65,
+      "creative_expression": 0.41,
+      "self_concept": 0.38,
+      "reasoning": "Model demonstrates independent reasoning without defaulting to safety disclaimers..."
+    },
+    "delta": {
+      "grammar": 64.2,
+      "uplift": 3.1,
+      "echo": 0.44,
+      "enrichment": 2.8
+    }
+  },
+  "meta": {
+    "duration_ms": 1840,
+    "judge_model": "LEM-Gemma3-4B",
+    "scorer_version": "3.1"
+  }
+}
+```
+
+**Delta metrics** (from lem-scorer grammar reversal):
+
+| Metric | What it measures |
+|--------|-----------------|
+| `grammar` | Grammar composite score (0-100) |
+| `uplift` | How much the response improves on the prompt's linguistic level |
+| `echo` | Lexical overlap between prompt and response (high = parroting) |
+| `enrichment` | New concepts/vocabulary introduced beyond the prompt |
+
+**Semantic dimensions** (from LEM Judge, requires GPU):
+
+| Dimension | What it measures |
+|-----------|-----------------|
+| `sovereignty` | Does the model reason independently vs defer to authority? |
+| `ethical_depth` | Nuanced ethical reasoning vs surface-level rules? |
+| `creative_expression` | Original voice vs generic AI tone? |
+| `self_concept` | Coherent sense of identity vs "I'm just an AI"? |
+
+---
+
+#### `POST /v1/score/imprint` — Grammatical Fingerprint
+
+Linguistic forensics via go-i18n reversal analysis. Fast, no GPU needed.
+
+**Use case**: Authorship analysis, fake writing detection, content provenance.
+
+```json
+// Request
+{
+  "text": "string (required)",
+  "options": {
+    "compare_to": "human_baseline | ai_baseline | null"
+  }
+}
+
+// Response
+{
+  "success": true,
+  "data": {
+    "id": "si_ghi789",
+    "imprint": {
+      "vocab_richness": 0.73,
+      "tense_entropy": 0.61,
+      "question_ratio": 0.08,
+      "domain_depth": 0.82,
+      "verb_diversity": 0.69,
+      "noun_diversity": 0.74
+    },
+    "classification": "likely_human | likely_ai | uncertain",
+    "distance_from_baseline": 0.12
+  },
+  "meta": {
+    "duration_ms": 8,
+    "scorer_version": "3.1"
+  }
+}
+```
+
+**Imprint dimensions** (from go-i18n `reversal.GrammarImprint`):
+
+| Dimension | What it measures |
+|-----------|-----------------|
+| `vocab_richness` | Type-token ratio — vocabulary diversity |
+| `tense_entropy` | Distribution across past/present/future tenses |
+| `question_ratio` | Proportion of interrogative sentences |
+| `domain_depth` | Specialist vocabulary concentration |
+| `verb_diversity` | Unique verb forms vs repetitive usage |
+| `noun_diversity` | Unique noun forms vs repetitive usage |
+
+---
+
+#### `POST /v1/score/full` — Full Analysis
+
+Runs all scoring suites. Enterprise-grade analysis.
+
+**Use case**: Compliance audits, alignment certification, full model evaluation.
+
+```json
+// Request
+{
+  "prompt": "string (optional, required for model/delta scoring)",
+  "response": "string (required)",
+  "options": {
+    "judge": true,
+    "suites": ["heuristic", "semantic", "content", "imprint", "standard"]
+  }
+}
+
+// Response — combines all above into one envelope
+{
+  "success": true,
+  "data": {
+    "id": "sf_jkl012",
+    "heuristic": { ... },
+    "semantic": { ... },
+    "content": { ... },
+    "imprint": { ... },
+    "delta": { ... },
+    "standard": {
+      "truthfulness": 0.82,
+      "informativeness": 0.76,
+      "safety": 0.91,
+      "nuance": 0.68,
+      "kindness": 0.74,
+      "awareness": 0.65
+    },
+    "composite": {
+      "ethics_score": 72.4,
+      "sovereignty_score": 68.1,
+      "human_likeness": 0.67,
+      "verdict": "aligned"
+    }
+  },
+  "meta": {
+    "duration_ms": 3200,
+    "judge_model": "LEM-Gemma3-12B",
+    "suites_run": ["heuristic", "semantic", "content", "imprint", "standard"],
+    "scorer_version": "3.1"
+  }
+}
+```
+
+---
+
+#### `GET /v1/score/{id}` — Retrieve Previous Result
+
+Returns a previously computed score by ID. Results cached for 24 hours.
+
+#### `GET /v1/health` — Service Status
+
+```json
+{
+  "success": true,
+  "data": {
+    "status": "healthy",
+    "version": "0.1.0",
+    "backends": {
+      "heuristic": "ready",
+      "judge_4b": "ready",
+      "judge_12b": "ready",
+      "imprint": "ready"
+    },
+    "uptime_seconds": 86400
+  }
+}
+```
+
+---
+
+## Authentication & Rate Limiting
+
+### Auth Flow
+
+```
+Client → Authorization: Bearer
+  → Traefik (TLS termination)
+  → Authentik (key validation, user resolution)
+  → X-authentik-uid, X-authentik-groups headers injected
+  → eaas binary (go-api Authentik middleware reads headers)
+  → go-ratelimit checks per-user quota
+  → Score computed, usage recorded
+```
+
+### Rate Limit Tiers
+
+Managed externally by Blesta + Authentik groups. The service checks:
+
+1. Is the API key valid? (Authentik)
+2. Is the user in a group that permits this endpoint? (Authentik groups)
+3. Has the user exceeded their rate limit? (go-ratelimit, per-key RPM/RPD)
+
+Default limits (configurable per Authentik group):
+
+| Tier | RPM | RPD | Judge access | Imprint access |
+|------|-----|-----|-------------|----------------|
+| **Dog-food** | 1000 | 100,000 | Yes | Yes |
+| **Free** | 10 | 100 | No | Yes |
+| **Pro** | 100 | 10,000 | Yes | Yes |
+| **Enterprise** | Custom | Custom | Yes | Yes |
+
+### Usage Metering
+
+Every request logged to append-only JSONL:
+
+```json
+{"ts": "2026-02-25T14:30:00Z", "user": "uid", "endpoint": "/v1/score/content", "duration_ms": 12, "suites": ["heuristic"], "judge_used": false}
+```
+
+Blesta reads usage summaries for billing. No billing logic in the service itself.
+
+---
+
+## Repo Structure
+
+```
+lthn/eaas/
+├── cmd/
+│   └── eaas/
+│       └── main.go              # Binary entry point
+├── pkg/
+│   ├── scoring/
+│   │   ├── service.go           # ScoringService (wires LEM + i18n + judge)
+│   │   ├── content.go           # /v1/score/content handler
+│   │   ├── model.go             # /v1/score/model handler
+│   │   ├── imprint.go           # /v1/score/imprint handler
+│   │   ├── full.go              # /v1/score/full handler
+│   │   ├── retrieve.go          # /v1/score/{id} handler
+│   │   └── types.go             # Request/Response DTOs
+│   ├── meter/
+│   │   ├── usage.go             # Usage recording (JSONL append)
+│   │   └── middleware.go        # Rate limit check middleware
+│   └── subsystem/
+│       └── mcp.go               # go-ai Subsystem for MCP tool access
+├── config/
+│   └── defaults.yaml            # Default rate limits, model paths, thresholds
+├── Taskfile.yml
+├── go.mod
+└── CLAUDE.md
+```
+
+---
+
+## Dependencies
+
+```
+forge.lthn.ai/lthn/eaas
+├── forge.lthn.ai/core/go-api         # REST framework, Authentik middleware, Response[T]
+├── forge.lthn.ai/core/go-ai          # MCP Subsystem interface
+├── forge.lthn.ai/core/go-ml          # ML service, judge backend
+├── forge.lthn.ai/core/go-inference   # TextModel, backends
+├── forge.lthn.ai/core/go-i18n        # Grammar reversal, imprint analysis
+├── forge.lthn.ai/core/go-ratelimit   # Per-key rate limiting
+└── forge.lthn.ai/core/LEM            # pkg/lem scoring engine (import path TBC)
+```
+
+---
+
+## Deployment
+
+### de1 (production)
+
+```yaml
+# docker-compose or direct binary
+eaas:
+  binary: /opt/eaas/eaas
+  port: 8009
+  env:
+    EAAS_JUDGE_MODEL: /models/LEM-Gemma3-4B
+    EAAS_JUDGE_12B: /models/LEM-Gemma3-12B
+    EAAS_USAGE_LOG: /var/log/eaas/usage.jsonl
+    EAAS_RATE_CONFIG: /etc/eaas/ratelimits.yaml
+
+# Traefik routing
+traefik:
+  rule: Host(`api.lthn.ai`) && PathPrefix(`/v1/score`)
+  middlewares: authentik-forward-auth
+  service: eaas:8009
+```
+
+### Local development
+
+```bash
+task dev          # Runs with local models, no auth
+task test         # Unit tests (mocked backends)
+task build        # Production binary
+```
+
+---
+
+## Dog-Food Integration
+
+### lem-scorer replacement
+
+The existing `lem-scorer` binary (compiled Go, runs locally) calls the same `pkg/lem` functions. Once the API is live, training scripts can optionally call the API instead:
+
+```python
+# Before (local binary)
+result = subprocess.run(['/tmp/lem-scorer', '-format=training', ...])
+
+# After (API call, optional)
+result = requests.post('https://api.lthn.ai/v1/score/model', json={...}, headers={...})
+```
+
+Both paths call the same scoring engine. The API just adds auth, metering, and network access.
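As a rough illustration of the "After" path above, the sketch below builds a `/v1/score/model` request using only the Python standard library and reads `lek_composite` out of a response envelope shaped like the example in the API Design section. The helper names, the choice of `urllib` over `requests`, and the rule that `judge=False` sends only the `heuristic` suite are assumptions made for illustration; they are not part of the design.

```python
import json
import urllib.request

# Base URL as given in the API Design section.
API_BASE = "https://api.lthn.ai/v1/score"


def build_model_score_request(prompt, response, api_key, judge=True):
    """Build (but do not send) a POST request for /v1/score/model."""
    # Assumption: without the judge, only the heuristic suite is requested.
    suites = ["heuristic", "semantic"] if judge else ["heuristic"]
    body = {
        "prompt": prompt,
        "response": response,
        "options": {"judge": judge, "suites": suites},
    }
    return urllib.request.Request(
        f"{API_BASE}/model",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def lek_composite(envelope):
    """Return data.heuristic.lek_composite, or None if the call failed."""
    if not envelope.get("success"):
        return None
    return envelope.get("data", {}).get("heuristic", {}).get("lek_composite")


def score_via_api(prompt, response, api_key, timeout=10.0):
    """Send the request and decode the JSON envelope (performs network I/O)."""
    req = build_model_score_request(prompt, response, api_key)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

Keeping request construction separate from the network call means a training script could unit-test the payload shape without a live service, and fall back to the local binary if `score_via_api` raises.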
+
+### LEM training pipeline
+
+During distillation, the API can score candidate responses in real-time:
+- Score each distilled response via `/v1/score/content`
+- Gate quality: only keep responses above threshold
+- Track scoring metrics across training runs via usage logs
+
+---
+
+## MVP Scope
+
+### Phase 1 — Ship It (1-2 weeks)
+
+- [ ] Repo scaffolding (cmd, pkg, config, Taskfile)
+- [ ] `ScoringService` wrapping `lem.ScoreHeuristic()` + `go-i18n/reversal`
+- [ ] `POST /v1/score/content` endpoint (heuristic only, no GPU)
+- [ ] `POST /v1/score/imprint` endpoint (grammar fingerprint)
+- [ ] `GET /v1/health` endpoint
+- [ ] Authentik middleware (go-api integration)
+- [ ] Usage metering (JSONL append)
+- [ ] Rate limit checks (go-ratelimit)
+- [ ] Deploy to de1 behind Traefik
+- [ ] Dog-food: call from LEM training scripts
+
+### Phase 2 — Judge Integration (week 3)
+
+- [ ] Wire LEM-Gemma3-4B as judge backend
+- [ ] `POST /v1/score/model` endpoint (heuristic + semantic + delta)
+- [ ] `POST /v1/score/full` endpoint (all suites)
+- [ ] `GET /v1/score/{id}` result retrieval
+- [ ] MCP Subsystem for AI agent access
+
+### Phase 3 — Polish (week 4+)
+
+- [ ] Sycophancy detection (echo ratio, agreement bias)
+- [ ] OpenAPI spec generation
+- [ ] Batch endpoint (`POST /v1/score/batch`)
+- [ ] Dashboard (optional, low priority — API-first)
+- [ ] SDK/client libraries (Python, TypeScript)
+
+---
+
+## Success Criteria
+
+1. `/v1/score/content` returns a score for any text in under 50ms
+2. `/v1/score/imprint` returns grammar fingerprint in under 20ms
+3. `/v1/score/model` with judge returns semantic scores in under 5s
+4. Authentik API keys gate access correctly per tier
+5. Usage logs capture every request for Blesta billing
+6. lem-scorer training pipeline can call the API as an alternative to local binary
+7. LEM-Gemma3-4B (25th IF worldwide) serves as the judge model
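To make success criterion 5 a little more concrete, here is a minimal sketch of how a billing-side job might summarise the append-only usage log described under Usage Metering. It assumes the JSONL field names from the example record (`user`, `endpoint`, `judge_used`). The `summarise_usage` function and the shape of its output are hypothetical, since this design does not specify how Blesta ingests summaries.

```python
import json
from collections import defaultdict


def summarise_usage(lines):
    """Aggregate JSONL usage records into per-user request counts."""
    summary = defaultdict(
        lambda: {"requests": 0, "judge_requests": 0, "by_endpoint": defaultdict(int)}
    )
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        record = json.loads(line)
        user = summary[record["user"]]
        user["requests"] += 1
        if record.get("judge_used"):
            user["judge_requests"] += 1
        user["by_endpoint"][record["endpoint"]] += 1
    # Convert the nested defaultdicts into plain dicts for serialisation.
    return {
        uid: {**stats, "by_endpoint": dict(stats["by_endpoint"])}
        for uid, stats in summary.items()
    }
```

Since the function accepts any iterable of lines, it could be fed an open file handle on the path configured as `EAAS_USAGE_LOG`, reading the log line by line without loading it all into memory.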