# RFC-0003: Sigil Transformation Framework
- **Status:** Standards Track
- **Version:** 1.0
- **Created:** 2025-01-13
- **Author:** Snider
## Abstract
This document specifies the Sigil Transformation Framework, a composable interface for defining reversible and irreversible data transformations. Sigils provide a uniform abstraction for encoding, compression, hashing, encryption, and other byte-level operations, enabling declarative transformation pipelines that can be applied and reversed systematically.
## Table of Contents
1. [Introduction](#1-introduction)
2. [Terminology](#2-terminology)
3. [Interface Specification](#3-interface-specification)
4. [Sigil Categories](#4-sigil-categories)
5. [Standard Sigils](#5-standard-sigils)
6. [Composition and Chaining](#6-composition-and-chaining)
7. [Error Handling](#7-error-handling)
8. [Implementation Guidelines](#8-implementation-guidelines)
9. [Security Considerations](#9-security-considerations)
10. [Future Work](#10-future-work)
11. [References](#11-references)
## 1. Introduction
Data transformation is a fundamental operation in software systems. Common transformations include:
- **Encoding**: Converting between representations (hex, base64)
- **Compression**: Reducing data size (gzip, zstd)
- **Encryption**: Protecting confidentiality (AES, ChaCha20)
- **Hashing**: Computing digests (SHA-256, BLAKE2)
- **Formatting**: Restructuring data (JSON minification)
The Sigil framework provides a uniform interface for all these operations, enabling:
- Declarative transformation pipelines
- Automatic reversal of transformation chains
- Composable, reusable transformation units
- Clear semantics for reversible vs. irreversible operations
### 1.1 Design Principles
1. **Simplicity**: Two methods, clear contract
2. **Composability**: Sigils combine naturally
3. **Reversibility awareness**: Explicit handling of one-way operations
4. **Null safety**: Defined behavior for nil/empty inputs
5. **Error propagation**: Clear error semantics
## 2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
- **Sigil**: A transformation unit implementing the Sigil interface
- **In operation**: The forward transformation (encode, compress, encrypt, hash)
- **Out operation**: The reverse transformation (decode, decompress, decrypt)
- **Reversible sigil**: A sigil where Out(In(x)) = x for all valid x
- **Irreversible sigil**: A sigil where Out returns the input unchanged or errors
- **Symmetric sigil**: A sigil where In(x) = Out(x) (e.g., byte reversal)
- **Transmutation**: Applying a sequence of sigils to data
## 3. Interface Specification
### 3.1 Sigil Interface
```
interface Sigil {
    // In transforms the data (forward operation).
    // Returns transformed data and any error encountered.
    In(data: bytes) -> (bytes, error)

    // Out reverses the transformation (reverse operation).
    // For irreversible sigils, returns data unchanged.
    Out(data: bytes) -> (bytes, error)
}
```
### 3.2 Method Contracts
#### 3.2.1 In Method
The `In` method MUST:
- Accept a byte slice as input
- Return a byte slice as output
- Return nil output for nil input (without error)
- Return empty slice for empty input (without error)
- Return an error if transformation fails
#### 3.2.2 Out Method
The `Out` method MUST:
- Accept a byte slice as input
- Return a byte slice as output
- Return nil output for nil input (without error)
- Return empty slice for empty input (without error)
- For reversible sigils: return the original data before `In` was applied
- For irreversible sigils: return the input unchanged (passthrough)
### 3.3 Transmute Function
The framework provides a helper function for applying multiple sigils:
```
function Transmute(data: bytes, sigils: Sigil[]) -> (bytes, error):
    for each sigil in sigils:
        data, err = sigil.In(data)
        if err != nil:
            return nil, err
    return data, nil
```
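The pseudocode above maps directly onto Go. The following is a minimal sketch rather than the reference implementation: the type names `hexSigil` and `base64Sigil` are assumptions chosen for illustration, and only the standard library is used.

```go
package main

import (
	"encoding/base64"
	"encoding/hex"
	"fmt"
)

// Sigil mirrors the two-method interface from Section 3.1.
type Sigil interface {
	In(data []byte) ([]byte, error)
	Out(data []byte) ([]byte, error)
}

// hexSigil is an illustrative stand-in for the standard `hex` sigil.
type hexSigil struct{}

func (hexSigil) In(data []byte) ([]byte, error) {
	if data == nil {
		return nil, nil
	}
	out := make([]byte, hex.EncodedLen(len(data)))
	hex.Encode(out, data)
	return out, nil
}

func (hexSigil) Out(data []byte) ([]byte, error) {
	if data == nil {
		return nil, nil
	}
	out := make([]byte, hex.DecodedLen(len(data)))
	n, err := hex.Decode(out, data)
	if err != nil {
		return nil, err
	}
	return out[:n], nil
}

// base64Sigil is an illustrative stand-in for the standard `base64` sigil.
type base64Sigil struct{}

func (base64Sigil) In(data []byte) ([]byte, error) {
	if data == nil {
		return nil, nil
	}
	out := make([]byte, base64.StdEncoding.EncodedLen(len(data)))
	base64.StdEncoding.Encode(out, data)
	return out, nil
}

func (base64Sigil) Out(data []byte) ([]byte, error) {
	if data == nil {
		return nil, nil
	}
	out := make([]byte, base64.StdEncoding.DecodedLen(len(data)))
	n, err := base64.StdEncoding.Decode(out, data)
	if err != nil {
		return nil, err
	}
	return out[:n], nil
}

// Transmute applies In for each sigil in order and stops at the first error.
func Transmute(data []byte, sigils []Sigil) ([]byte, error) {
	for _, s := range sigils {
		var err error
		data, err = s.In(data)
		if err != nil {
			return nil, err
		}
	}
	return data, nil
}

func main() {
	out, err := Transmute([]byte("Hello"), []Sigil{base64Sigil{}, hexSigil{}})
	fmt.Printf("%s %v\n", out, err)
}
```

Here `"Hello"` is first Base64-encoded to `SGVsbG8=` and that text is then hex-encoded, so the output grows at every step of the chain.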
## 4. Sigil Categories
### 4.1 Reversible Sigils
Reversible sigils can recover the original input from the output.
**Property**: For any valid input `x`:
```
sigil.Out(sigil.In(x)) == x
```
Examples:
- Encoding sigils (hex, base64)
- Compression sigils (gzip)
- Encryption sigils (ChaCha20-Poly1305)
### 4.2 Irreversible Sigils
Irreversible sigils perform one-way transformations.
**Property**: The `Out` method returns input unchanged:
```
sigil.Out(x) == x
```
Examples:
- Hash sigils (SHA-256, MD5)
- Truncation sigils
### 4.3 Symmetric Sigils
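Symmetric sigils have identical `In` and `Out` operations. For such a sigil to also be reversible, the operation must be its own inverse, that is `In(In(x)) == x`; all of the examples below satisfy this.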
**Property**: For any input `x`:
```
sigil.In(x) == sigil.Out(x)
```
Examples:
- Byte reversal
- XOR with fixed key
- Bitwise NOT
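As an illustration of the symmetric property, here is a minimal sketch of an XOR sigil. The type `xorSigil` and its `key` field are hypothetical and do not appear in the standard registry; the point is only that `Out` can delegate to `In` when the operation is its own inverse.

```go
package main

import (
	"errors"
	"fmt"
)

// xorSigil XORs every byte with a repeating fixed key.
// Hypothetical example type, not part of the standard sigil set.
type xorSigil struct{ key []byte }

func (s xorSigil) In(data []byte) ([]byte, error) {
	if data == nil {
		return nil, nil
	}
	if len(data) == 0 {
		return []byte{}, nil
	}
	if len(s.key) == 0 {
		return nil, errors.New("xor sigil: empty key")
	}
	out := make([]byte, len(data))
	for i, b := range data {
		out[i] = b ^ s.key[i%len(s.key)]
	}
	return out, nil
}

// Out is the same operation as In, which is what makes the sigil symmetric.
func (s xorSigil) Out(data []byte) ([]byte, error) {
	return s.In(data)
}

func main() {
	s := xorSigil{key: []byte{0x5a}}
	enc, _ := s.In([]byte("Hello"))
	dec, _ := s.Out(enc)
	fmt.Printf("%x %s\n", enc, dec)
}
```

XOR with a fixed key is shown purely to demonstrate the category; it offers no meaningful confidentiality.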
## 5. Standard Sigils
### 5.1 Encoding Sigils
#### 5.1.1 Hex Sigil
Encodes data to hexadecimal representation.
| Property | Value |
|----------|-------|
| Name | `hex` |
| Category | Reversible |
| In | Binary to hex ASCII |
| Out | Hex ASCII to binary |
| Output expansion | 2x |
```
In("Hello") -> "48656c6c6f"
Out("48656c6c6f") -> "Hello"
```
#### 5.1.2 Base64 Sigil
Encodes data to Base64 representation (RFC 4648).
| Property | Value |
|----------|-------|
| Name | `base64` |
| Category | Reversible |
| In | Binary to Base64 ASCII |
| Out | Base64 ASCII to binary |
| Output expansion | ~1.33x |
```
In("Hello") -> "SGVsbG8="
Out("SGVsbG8=") -> "Hello"
```
### 5.2 Transformation Sigils
#### 5.2.1 Reverse Sigil
Reverses the byte order of the data.
| Property | Value |
|----------|-------|
| Name | `reverse` |
| Category | Symmetric |
| In | Reverse bytes |
| Out | Reverse bytes |
| Output expansion | 1x |
```
In("Hello") -> "olleH"
Out("olleH") -> "Hello"
```
### 5.3 Compression Sigils
#### 5.3.1 Gzip Sigil
Compresses data using gzip (RFC 1952).
| Property | Value |
|----------|-------|
| Name | `gzip` |
| Category | Reversible |
| In | Compress |
| Out | Decompress |
| Output expansion | Variable (typically < 1x) |
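A gzip sigil can be sketched with Go's `compress/gzip` package. This is an assumption about one reasonable shape, not the reference code; `gzipSigil` is an illustrative name. Note the explicit short-circuit for empty input: without it, compressing zero bytes would still emit a gzip header and trailer, which would conflict with the empty-in, empty-out rule of Section 3.2.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// gzipSigil is an illustrative stand-in for the standard `gzip` sigil.
type gzipSigil struct{}

func (gzipSigil) In(data []byte) ([]byte, error) {
	if data == nil {
		return nil, nil
	}
	if len(data) == 0 {
		return []byte{}, nil
	}
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return nil, err
	}
	// Close flushes the remaining data and writes the gzip trailer.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func (gzipSigil) Out(data []byte) ([]byte, error) {
	if data == nil {
		return nil, nil
	}
	if len(data) == 0 {
		return []byte{}, nil
	}
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	out, err := io.ReadAll(zr)
	if err != nil {
		return nil, err // never return partially decompressed data
	}
	return out, nil
}

func main() {
	s := gzipSigil{}
	in := bytes.Repeat([]byte("sigil "), 100)
	packed, _ := s.In(in)
	unpacked, _ := s.Out(packed)
	fmt.Println(len(in), len(packed), bytes.Equal(in, unpacked))
}
```

Highly repetitive input such as the one in `main` shrinks considerably, while short or high-entropy input can come out larger than it went in because of the fixed gzip framing.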
### 5.4 Formatting Sigils
#### 5.4.1 JSON Sigil
Compacts JSON data by removing whitespace.
| Property | Value |
|----------|-------|
| Name | `json` |
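| Category | Irreversible* |
| In | Compact JSON |
| Out | Passthrough |

*Note: The original whitespace is not recoverable, so `Out` returns its input unchanged. The JSON value itself is preserved, which is why Appendix A lists this sigil as partially reversible.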
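In Go, compaction is available in the standard library as `json.Compact`. The sketch below assumes that function is a suitable basis for a `json` sigil; the type name `jsonSigil` is illustrative only.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// jsonSigil is an illustrative stand-in for the `json` sigil.
type jsonSigil struct{}

// In removes insignificant whitespace and rejects invalid JSON.
func (jsonSigil) In(data []byte) ([]byte, error) {
	if data == nil {
		return nil, nil
	}
	if len(data) == 0 {
		return []byte{}, nil
	}
	var buf bytes.Buffer
	if err := json.Compact(&buf, data); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// Out is a passthrough: the removed whitespace cannot be reconstructed.
func (jsonSigil) Out(data []byte) ([]byte, error) {
	return data, nil
}

func main() {
	out, err := jsonSigil{}.In([]byte(`{ "a": [1, 2] }`))
	fmt.Printf("%s %v\n", out, err)
}
```

Because `json.Compact` validates while it copies, malformed input surfaces as an input error rather than being passed along the chain.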
#### 5.4.2 JSON-Indent Sigil
Pretty-prints JSON data with indentation.
| Property | Value |
|----------|-------|
| Name | `json-indent` |
| Category | Irreversible* |
| In | Indent JSON (2 spaces) |
| Out | Passthrough |

*Note: The original formatting is not recoverable, so `Out` returns its input unchanged.
### 5.5 Encryption Sigils
Encryption sigils provide authenticated encryption using AEAD ciphers.
#### 5.5.1 ChaCha20-Poly1305 Sigil
Encrypts data using XChaCha20-Poly1305 authenticated encryption.
| Property | Value |
|----------|-------|
| Name | `chacha20poly1305` |
| Category | Reversible |
| Key size | 32 bytes |
| Nonce size | 24 bytes (XChaCha variant) |
| Tag size | 16 bytes |
| In | Encrypt (generates nonce, prepends to output) |
| Out | Decrypt (extracts nonce from input prefix) |

**Critical Implementation Detail**: The nonce is prepended to the ciphertext in the output of `In`; it is not transmitted separately:
```
In(plaintext) -> [24-byte nonce][ciphertext][16-byte tag]
Out(ciphertext_with_nonce) -> plaintext
```
**Construction**:
```go
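sigil, err := NewChaChaPolySigil(key) // key must be exactly 32 bytes
// Check err after every call; the checks are omitted here for brevity.
ciphertext, err := sigil.In(plaintext)  // nonce || ciphertext || tag
recovered, err := sigil.Out(ciphertext) // recovered equals plaintext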
```
**Security Properties**:
- Authenticated: Poly1305 MAC prevents tampering
- Confidential: ChaCha20 stream cipher
- Nonce uniqueness: Random 24-byte nonce per encryption
- No nonce management required by caller
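The nonce-prefix layout can be demonstrated with any AEAD. The sketch below is an assumption-laden illustration, not the reference sigil: to stay within the Go standard library it substitutes AES-256-GCM for XChaCha20-Poly1305, so the nonce is 12 bytes here instead of 24. The names `aeadSigil` and `newAEADSigil` are hypothetical.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// aeadSigil prepends a fresh random nonce to every ciphertext.
// ASSUMPTION: AES-256-GCM stands in for XChaCha20-Poly1305 so that the
// example needs no external modules; the layout idea is the same.
type aeadSigil struct{ aead cipher.AEAD }

func newAEADSigil(key []byte) (*aeadSigil, error) {
	if len(key) != 32 {
		return nil, errors.New("aead sigil: key must be 32 bytes")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	return &aeadSigil{aead: aead}, nil
}

func (s *aeadSigil) In(plaintext []byte) ([]byte, error) {
	if plaintext == nil {
		return nil, nil
	}
	if len(plaintext) == 0 {
		return []byte{}, nil // empty passes through, per Section 3.2
	}
	nonce := make([]byte, s.aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Seal appends ciphertext and tag to its first argument, so the
	// result is laid out as nonce || ciphertext || tag.
	return s.aead.Seal(nonce, nonce, plaintext, nil), nil
}

func (s *aeadSigil) Out(data []byte) ([]byte, error) {
	if data == nil {
		return nil, nil
	}
	if len(data) == 0 {
		return []byte{}, nil
	}
	n := s.aead.NonceSize()
	if len(data) < n+s.aead.Overhead() {
		return nil, errors.New("aead sigil: ciphertext too short")
	}
	// Open verifies the tag and fails if the data was modified.
	return s.aead.Open(nil, data[:n], data[n:], nil)
}

func main() {
	key := make([]byte, 32) // all-zero demo key; never use a fixed key in practice
	s, err := newAEADSigil(key)
	if err != nil {
		panic(err)
	}
	ct, _ := s.In([]byte("Hello"))
	pt, _ := s.Out(ct)
	fmt.Println(len(ct), string(pt))
}
```

Because the nonce travels with the ciphertext, the output is always longer than the plaintext by the nonce size plus the tag size, and two encryptions of the same plaintext differ.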
### 5.6 Hash Sigils
Hash sigils compute cryptographic digests. They are irreversible.
| Name | Algorithm | Output Size |
|------|-----------|-------------|
| `md4` | MD4 | 16 bytes |
| `md5` | MD5 | 16 bytes |
| `sha1` | SHA-1 | 20 bytes |
| `sha224` | SHA-224 | 28 bytes |
| `sha256` | SHA-256 | 32 bytes |
| `sha384` | SHA-384 | 48 bytes |
| `sha512` | SHA-512 | 64 bytes |
| `sha3-224` | SHA3-224 | 28 bytes |
| `sha3-256` | SHA3-256 | 32 bytes |
| `sha3-384` | SHA3-384 | 48 bytes |
| `sha3-512` | SHA3-512 | 64 bytes |
| `sha512-224` | SHA-512/224 | 28 bytes |
| `sha512-256` | SHA-512/256 | 32 bytes |
| `ripemd160` | RIPEMD-160 | 20 bytes |
| `blake2s-256` | BLAKE2s | 32 bytes |
| `blake2b-256` | BLAKE2b | 32 bytes |
| `blake2b-384` | BLAKE2b | 48 bytes |
| `blake2b-512` | BLAKE2b | 64 bytes |

For all hash sigils:
- `In(data)` returns the hash digest as raw bytes
- `Out(data)` returns data unchanged (passthrough)
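A hash sigil is small enough to show in full. The sketch below assumes the `sha256` sigil can be built on Go's `crypto/sha256`; the type name `sha256Sigil` is illustrative. One consequence worth noting, if the Section 3.2 contract is applied literally, is that an empty input yields an empty slice rather than the digest of the empty string.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sha256Sigil is an illustrative stand-in for the `sha256` sigil.
type sha256Sigil struct{}

// In returns the raw 32-byte digest, not a hex string.
func (sha256Sigil) In(data []byte) ([]byte, error) {
	if data == nil {
		return nil, nil
	}
	if len(data) == 0 {
		return []byte{}, nil
	}
	sum := sha256.Sum256(data)
	return sum[:], nil
}

// Out cannot invert a hash, so it passes its input through.
func (sha256Sigil) Out(data []byte) ([]byte, error) {
	return data, nil
}

func main() {
	digest, _ := sha256Sigil{}.In([]byte("abc"))
	fmt.Println(len(digest), hex.EncodeToString(digest))
}
```

To obtain a printable digest, a hash sigil would typically be followed by `hex` in the chain.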
## 6. Composition and Chaining
### 6.1 Forward Chain (Packing)
Sigils are applied left-to-right:
```
sigils = [gzip, base64, hex]
result = Transmute(data, sigils)
// Equivalent to:
result = hex.In(base64.In(gzip.In(data)))
```
### 6.2 Reverse Chain (Unpacking)
To reverse a chain, apply `Out` in reverse order:
```
function ReverseTransmute(data: bytes, sigils: Sigil[]) -> (bytes, error):
    for i = length(sigils) - 1 downto 0:
        data, err = sigils[i].Out(data)
        if err != nil:
            return nil, err
    return data, nil
```
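The reverse loop can be sketched in Go as follows. The sigil types `reverseSigil` and `hexSigil` are illustrative assumptions included only so the example runs on its own; the part that matters is that `ReverseTransmute` walks the same slice backwards and calls `Out`.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

type Sigil interface {
	In(data []byte) ([]byte, error)
	Out(data []byte) ([]byte, error)
}

// reverseSigil reverses byte order; In and Out are the same operation.
type reverseSigil struct{}

func (reverseSigil) In(data []byte) ([]byte, error) {
	if data == nil {
		return nil, nil
	}
	out := make([]byte, len(data))
	for i, b := range data {
		out[len(data)-1-i] = b
	}
	return out, nil
}

func (s reverseSigil) Out(data []byte) ([]byte, error) { return s.In(data) }

// hexSigil encodes to and decodes from lowercase hexadecimal.
type hexSigil struct{}

func (hexSigil) In(data []byte) ([]byte, error) {
	if data == nil {
		return nil, nil
	}
	out := make([]byte, hex.EncodedLen(len(data)))
	hex.Encode(out, data)
	return out, nil
}

func (hexSigil) Out(data []byte) ([]byte, error) {
	if data == nil {
		return nil, nil
	}
	out := make([]byte, hex.DecodedLen(len(data)))
	if _, err := hex.Decode(out, data); err != nil {
		return nil, err
	}
	return out, nil
}

// Transmute applies In left to right.
func Transmute(data []byte, sigils []Sigil) ([]byte, error) {
	for _, s := range sigils {
		var err error
		if data, err = s.In(data); err != nil {
			return nil, err
		}
	}
	return data, nil
}

// ReverseTransmute applies Out right to left, undoing Transmute.
func ReverseTransmute(data []byte, sigils []Sigil) ([]byte, error) {
	for i := len(sigils) - 1; i >= 0; i-- {
		var err error
		if data, err = sigils[i].Out(data); err != nil {
			return nil, err
		}
	}
	return data, nil
}

func main() {
	chain := []Sigil{reverseSigil{}, hexSigil{}}
	packed, _ := Transmute([]byte("Hello"), chain)
	unpacked, _ := ReverseTransmute(packed, chain)
	fmt.Printf("%s %s\n", packed, unpacked) // prints "6f6c6c6548 Hello"
}
```

Passing the same slice to both functions keeps the packing and unpacking definitions from drifting apart.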
### 6.3 Chain Properties
For a chain of reversible sigils `[s1, s2, s3]`:
```
original = ReverseTransmute(Transmute(data, [s1, s2, s3]), [s1, s2, s3])
// original == data
```
### 6.4 Mixed Chains
Chains MAY contain both reversible and irreversible sigils:
```
sigils = [gzip, sha256] // sha256 is irreversible
packed = Transmute(data, sigils)
// packed is the SHA-256 hash of gzip-compressed data
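unpacked, err = ReverseTransmute(packed, sigils)
// sha256.Out passes the digest through unchanged, but gzip.Out then
// receives a 32-byte digest instead of a gzip stream and returns an error.
// The original data cannot be recovered from a chain containing an irreversible sigil.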
```
## 7. Error Handling
### 7.1 Error Categories
| Category | Description | Recovery |
|----------|-------------|----------|
| Input error | Invalid input format | Check input validity |
| State error | Sigil not properly configured | Initialize sigil |
| Resource error | Memory/IO failure | Retry or abort |
| Algorithm error | Cryptographic failure | Check keys/params |
### 7.2 Error Propagation
Errors MUST propagate immediately:
```
function Transmute(data: bytes, sigils: Sigil[]) -> (bytes, error):
    for each sigil in sigils:
        data, err = sigil.In(data)
        if err != nil:
            return nil, err // Stop immediately
    return data, nil
```
### 7.3 Partial Results
On error, implementations MUST NOT return partial results. Either:
- Return complete transformed data, or
- Return nil with an error
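The two rules above (stop at the first error, return no partial data) can be checked with a small Go sketch. The helper types `countingSigil` and `brokenSigil` are hypothetical test doubles invented for this example.

```go
package main

import (
	"errors"
	"fmt"
)

type Sigil interface {
	In(data []byte) ([]byte, error)
	Out(data []byte) ([]byte, error)
}

// countingSigil passes data through and records how often In was called.
type countingSigil struct{ calls *int }

func (c countingSigil) In(data []byte) ([]byte, error) {
	*c.calls++
	return data, nil
}

func (c countingSigil) Out(data []byte) ([]byte, error) { return data, nil }

var errBroken = errors.New("broken sigil")

// brokenSigil always fails, simulating a transformation error mid-chain.
type brokenSigil struct{}

func (brokenSigil) In([]byte) ([]byte, error)  { return nil, errBroken }
func (brokenSigil) Out([]byte) ([]byte, error) { return nil, errBroken }

// Transmute stops at the first failing sigil and discards intermediate data.
func Transmute(data []byte, sigils []Sigil) ([]byte, error) {
	for _, s := range sigils {
		var err error
		data, err = s.In(data)
		if err != nil {
			return nil, err
		}
	}
	return data, nil
}

func main() {
	before, after := 0, 0
	chain := []Sigil{countingSigil{&before}, brokenSigil{}, countingSigil{&after}}
	out, err := Transmute([]byte("Hello"), chain)
	// The sigil after the failure is never invoked.
	fmt.Println(out == nil, err, before, after)
}
```

Returning `nil` explicitly, instead of whatever the failing sigil produced, is what guarantees callers never see a half-transformed buffer.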
## 8. Implementation Guidelines
### 8.1 Sigil Factory
Implementations SHOULD provide a factory function:
```
function NewSigil(name: string) -> (Sigil, error):
    switch name:
        case "hex": return new HexSigil(), nil
        case "base64": return new Base64Sigil(), nil
        case "gzip": return new GzipSigil(), nil
        // ... etc
        default: return nil, error("unknown sigil: " + name)
```
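One way to realize such a factory in Go is a lookup table of constructors, sketched below. The `registry` map and the two sigil types are assumptions for illustration; the reference implementation may well use a `switch` as in the pseudocode.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

type Sigil interface {
	In(data []byte) ([]byte, error)
	Out(data []byte) ([]byte, error)
}

type hexSigil struct{}

func (hexSigil) In(data []byte) ([]byte, error) {
	if data == nil {
		return nil, nil
	}
	out := make([]byte, hex.EncodedLen(len(data)))
	hex.Encode(out, data)
	return out, nil
}

func (hexSigil) Out(data []byte) ([]byte, error) {
	if data == nil {
		return nil, nil
	}
	out := make([]byte, hex.DecodedLen(len(data)))
	if _, err := hex.Decode(out, data); err != nil {
		return nil, err
	}
	return out, nil
}

type reverseSigil struct{}

func (reverseSigil) In(data []byte) ([]byte, error) {
	if data == nil {
		return nil, nil
	}
	out := make([]byte, len(data))
	for i, b := range data {
		out[len(data)-1-i] = b
	}
	return out, nil
}

func (s reverseSigil) Out(data []byte) ([]byte, error) { return s.In(data) }

// registry maps sigil names to constructors (hypothetical structure).
var registry = map[string]func() Sigil{
	"hex":     func() Sigil { return hexSigil{} },
	"reverse": func() Sigil { return reverseSigil{} },
}

// NewSigil returns the sigil registered under name, or an error if unknown.
func NewSigil(name string) (Sigil, error) {
	ctor, ok := registry[name]
	if !ok {
		return nil, fmt.Errorf("unknown sigil: %s", name)
	}
	return ctor(), nil
}

func main() {
	s, err := NewSigil("hex")
	if err != nil {
		panic(err)
	}
	out, _ := s.In([]byte("Hello"))
	_, missing := NewSigil("rot13")
	fmt.Printf("%s / %v\n", out, missing)
}
```

A table-driven factory makes it straightforward to build a chain from a list of names, for example when a pipeline is described in configuration.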
### 8.2 Null Safety
```
function In(data: bytes) -> (bytes, error):
    if data == nil:
        return nil, nil // NOT an error
    if length(data) == 0:
        return [], nil // Empty slice, NOT nil
    // ... perform transformation
```
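In Go the distinction between a nil slice and an empty slice is observable, so both guards are needed. The following sketch applies them to a Base64 sigil; `base64Sigil` is an illustrative name, not necessarily the one used by the reference code.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// base64Sigil is an illustrative stand-in for the `base64` sigil.
type base64Sigil struct{}

func (base64Sigil) In(data []byte) ([]byte, error) {
	if data == nil {
		return nil, nil // nil in, nil out, no error
	}
	if len(data) == 0 {
		return []byte{}, nil // empty but non-nil
	}
	out := make([]byte, base64.StdEncoding.EncodedLen(len(data)))
	base64.StdEncoding.Encode(out, data)
	return out, nil
}

func (base64Sigil) Out(data []byte) ([]byte, error) {
	if data == nil {
		return nil, nil
	}
	if len(data) == 0 {
		return []byte{}, nil
	}
	out := make([]byte, base64.StdEncoding.DecodedLen(len(data)))
	n, err := base64.StdEncoding.Decode(out, data)
	if err != nil {
		return nil, err
	}
	return out[:n], nil // DecodedLen is an upper bound when padding is present
}

func main() {
	s := base64Sigil{}
	a, _ := s.In(nil)
	b, _ := s.In([]byte{})
	c, _ := s.In([]byte("Hello"))
	fmt.Println(a == nil, b != nil && len(b) == 0, string(c)) // prints "true true SGVsbG8="
}
```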
### 8.3 Immutability
Sigils SHOULD NOT modify the input slice:
```
// CORRECT: Create new slice
result := make([]byte, len(data))
// ... transform into result
// INCORRECT: Modify in place
data[0] = transformed // Don't do this
```
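A complete version of the correct pattern might look like the sketch below, which uses a bitwise NOT sigil (the hypothetical `notSigil`) because the temptation to flip bytes in place is strongest for such simple transforms.

```go
package main

import "fmt"

// notSigil inverts every bit. Hypothetical example type.
type notSigil struct{}

func (notSigil) In(data []byte) ([]byte, error) {
	if data == nil {
		return nil, nil
	}
	out := make([]byte, len(data)) // fresh buffer; data is only read
	for i, b := range data {
		out[i] = ^b
	}
	return out, nil
}

// Out is identical to In because bitwise NOT is its own inverse.
func (s notSigil) Out(data []byte) ([]byte, error) {
	return s.In(data)
}

func main() {
	in := []byte{0x00, 0x0f, 0xff}
	out, _ := notSigil{}.In(in)
	fmt.Printf("%x %x\n", in, out) // prints "000fff fff000"
}
```

Since the caller's slice is untouched, the same input can safely be fed to several chains or retried after an error.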
### 8.4 Thread Safety
Sigils SHOULD be safe for concurrent use:
- Avoid mutable state in sigil instances
- Use synchronization if state is required
- Document thread-safety guarantees
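The simplest way to meet these guidelines is a sigil with no fields at all. The sketch below shares one such value across goroutines; `reverseSigil` and `transformConcurrently` are illustrative names, and the pattern shown is an assumption about typical usage rather than a requirement of this specification.

```go
package main

import (
	"fmt"
	"sync"
)

// reverseSigil has no fields, so there is no shared mutable state.
type reverseSigil struct{}

func (reverseSigil) In(data []byte) ([]byte, error) {
	if data == nil {
		return nil, nil
	}
	out := make([]byte, len(data))
	for i, b := range data {
		out[len(data)-1-i] = b
	}
	return out, nil
}

func (s reverseSigil) Out(data []byte) ([]byte, error) { return s.In(data) }

// transformConcurrently calls In on the same sigil from several goroutines.
func transformConcurrently(s reverseSigil, input []byte, workers int) []string {
	results := make([]string, workers)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			out, _ := s.In(input)    // input is only read, never written
			results[w] = string(out) // each goroutine writes its own slot
		}(w)
	}
	wg.Wait()
	return results
}

func main() {
	fmt.Println(transformConcurrently(reverseSigil{}, []byte("Hello"), 4))
}
```

This works without locks only because the sigil also follows the immutability guideline: concurrent readers of the same input slice are safe, concurrent writers would not be.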
## 9. Security Considerations
### 9.1 Hash Sigil Security
- MD4, MD5, and SHA-1 are cryptographically broken with respect to collision resistance
- Use SHA-256 or stronger for security-critical applications
- Hash sigils do NOT provide authentication
### 9.2 Compression Oracle Attacks
When combining compression and encryption sigils:
- Be aware of CRIME/BREACH-style attacks
- Do not compress data containing secrets alongside attacker-controlled data
### 9.3 Memory Safety
- Validate output buffer sizes before allocation
- Implement maximum input size limits
- Handle decompression bombs (zip bombs)
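One common defence against decompression bombs is to cap the number of bytes read from the decompressor. The sketch below is an assumption about how a gzip `Out` could enforce such a cap; `gunzipLimited`, `gzipBytes`, and `errTooLarge` are hypothetical names, and the limit value is arbitrary.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"errors"
	"fmt"
	"io"
)

var errTooLarge = errors.New("gzip sigil: decompressed size exceeds limit")

// gunzipLimited decompresses data but refuses to produce more than max bytes.
func gunzipLimited(data []byte, max int64) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	// Read at most max+1 bytes: the extra byte reveals an oversized
	// stream without buffering all of it.
	out, err := io.ReadAll(io.LimitReader(zr, max+1))
	if err != nil {
		return nil, err
	}
	if int64(len(out)) > max {
		return nil, errTooLarge
	}
	return out, nil
}

// gzipBytes compresses data for the demo. Writes to a bytes.Buffer do not
// fail, so the errors are ignored here.
func gzipBytes(data []byte) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write(data)
	zw.Close()
	return buf.Bytes()
}

func main() {
	// 1 MiB of zeros compresses to a tiny fraction of its size.
	bomb := gzipBytes(make([]byte, 1<<20))
	_, err := gunzipLimited(bomb, 64<<10)
	fmt.Println(len(bomb), err)
}
```

The cap bounds memory use by the limit rather than by whatever size the compressed stream claims to expand to.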
### 9.4 Timing Attacks
- Comparison operations should be constant-time where security-relevant
- Hash comparisons should use constant-time comparison functions
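In Go, constant-time comparison is provided by `crypto/subtle`. A minimal sketch of comparing two digests follows; the helper name `digestsEqual` is an assumption.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
)

// digestsEqual compares two digests without revealing, through timing,
// the position of the first differing byte. Slices of different length
// compare as unequal immediately, so only the length can leak.
func digestsEqual(a, b []byte) bool {
	return subtle.ConstantTimeCompare(a, b) == 1
}

func main() {
	want := sha256.Sum256([]byte("payload"))
	got := sha256.Sum256([]byte("payload"))
	other := sha256.Sum256([]byte("Payload"))
	fmt.Println(digestsEqual(want[:], got[:]), digestsEqual(want[:], other[:])) // prints "true false"
}
```

A plain `bytes.Equal` may return as soon as it finds a mismatch, which is the behaviour this helper avoids.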
## 10. Future Work
- [ ] AES-GCM encryption sigil for environments requiring AES
- [ ] Zstd compression sigil with configurable compression levels
- [ ] Streaming sigil interface for large data processing
- [ ] Sigil metadata interface for reporting transformation properties
- [ ] WebAssembly compilation for browser-based sigil operations
- [ ] Hardware acceleration detection and utilization
## 11. References
- [RFC 2119] Key words for use in RFCs to Indicate Requirement Levels
- [RFC 4648] The Base16, Base32, and Base64 Data Encodings
- [RFC 1952] GZIP file format specification
- [RFC 8259] The JavaScript Object Notation (JSON) Data Interchange Format
- [FIPS 180-4] Secure Hash Standard
- [FIPS 202] SHA-3 Standard
- [RFC 8439] ChaCha20 and Poly1305 for IETF Protocols
---
## Appendix A: Sigil Name Registry
| Name | Category | Reversible | Notes |
|------|----------|------------|-------|
| `reverse` | Transform | Yes (symmetric) | Byte reversal |
| `hex` | Encoding | Yes | Hexadecimal |
| `base64` | Encoding | Yes | RFC 4648 |
| `gzip` | Compression | Yes | RFC 1952 |
| `zstd` | Compression | Yes | Zstandard (not specified in Section 5; see Section 10) |
| `json` | Formatting | Partial | Compacts JSON |
| `json-indent` | Formatting | Partial | Pretty-prints JSON |
| `chacha20poly1305` | Encryption | Yes | XChaCha20-Poly1305 AEAD |
| `md4` | Hash | No | 128-bit |
| `md5` | Hash | No | 128-bit |
| `sha1` | Hash | No | 160-bit |
| `sha224` | Hash | No | 224-bit |
| `sha256` | Hash | No | 256-bit |
| `sha384` | Hash | No | 384-bit |
| `sha512` | Hash | No | 512-bit |
| `sha3-*` | Hash | No | SHA-3 family |
| `sha512-*` | Hash | No | SHA-512 truncated |
| `ripemd160` | Hash | No | 160-bit |
| `blake2s-256` | Hash | No | 256-bit |
| `blake2b-*` | Hash | No | BLAKE2b family |
## Appendix B: Reference Implementation
A reference implementation in Go is available at:
- Interface: `github.com/Snider/Enchantrix/pkg/enchantrix/enchantrix.go`
- Standard sigils: `github.com/Snider/Enchantrix/pkg/enchantrix/sigils.go`
## Appendix C: Custom Sigil Example
```go
// ROT13Sigil implements a simple letter rotation cipher.
type ROT13Sigil struct{}

func (s *ROT13Sigil) In(data []byte) ([]byte, error) {
	if data == nil {
		return nil, nil
	}
	result := make([]byte, len(data))
	for i, b := range data {
		if b >= 'A' && b <= 'Z' {
			result[i] = 'A' + (b-'A'+13)%26
		} else if b >= 'a' && b <= 'z' {
			result[i] = 'a' + (b-'a'+13)%26
		} else {
			result[i] = b
		}
	}
	return result, nil
}

func (s *ROT13Sigil) Out(data []byte) ([]byte, error) {
	return s.In(data) // ROT13 is symmetric
}
```
## Appendix D: Changelog
- **1.0** (2025-01-13): Initial specification