# RFC-0003: Sigil Transformation Framework **Status:** Standards Track **Version:** 1.0 **Created:** 2025-01-13 **Author:** Snider ## Abstract This document specifies the Sigil Transformation Framework, a composable interface for defining reversible and irreversible data transformations. Sigils provide a uniform abstraction for encoding, compression, hashing, encryption, and other byte-level operations, enabling declarative transformation pipelines that can be applied and reversed systematically. ## Table of Contents 1. [Introduction](#1-introduction) 2. [Terminology](#2-terminology) 3. [Interface Specification](#3-interface-specification) 4. [Sigil Categories](#4-sigil-categories) 5. [Standard Sigils](#5-standard-sigils) 6. [Composition and Chaining](#6-composition-and-chaining) 7. [Error Handling](#7-error-handling) 8. [Implementation Guidelines](#8-implementation-guidelines) 9. [Security Considerations](#9-security-considerations) 10. [References](#10-references) ## 1. Introduction Data transformation is a fundamental operation in software systems. Common transformations include: - **Encoding**: Converting between representations (hex, base64) - **Compression**: Reducing data size (gzip, zstd) - **Encryption**: Protecting confidentiality (AES, ChaCha20) - **Hashing**: Computing digests (SHA-256, BLAKE2) - **Formatting**: Restructuring data (JSON minification) The Sigil framework provides a uniform interface for all these operations, enabling: - Declarative transformation pipelines - Automatic reversal of transformation chains - Composable, reusable transformation units - Clear semantics for reversible vs. irreversible operations ### 1.1 Design Principles 1. **Simplicity**: Two methods, clear contract 2. **Composability**: Sigils combine naturally 3. **Reversibility awareness**: Explicit handling of one-way operations 4. **Null safety**: Defined behavior for nil/empty inputs 5. **Error propagation**: Clear error semantics ## 2. Terminology The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119. **Sigil**: A transformation unit implementing the Sigil interface **In operation**: The forward transformation (encode, compress, encrypt, hash) **Out operation**: The reverse transformation (decode, decompress, decrypt) **Reversible sigil**: A sigil where Out(In(x)) = x for all valid x **Irreversible sigil**: A sigil where Out returns the input unchanged or errors **Symmetric sigil**: A sigil where In(x) = Out(x) (e.g., byte reversal) **Transmutation**: Applying a sequence of sigils to data ## 3. Interface Specification ### 3.1 Sigil Interface ``` interface Sigil { // In transforms the data (forward operation). // Returns transformed data and any error encountered. In(data: bytes) -> (bytes, error) // Out reverses the transformation (reverse operation). // For irreversible sigils, returns data unchanged. Out(data: bytes) -> (bytes, error) } ``` ### 3.2 Method Contracts #### 3.2.1 In Method The `In` method MUST: - Accept a byte slice as input - Return a byte slice as output - Return nil output for nil input (without error) - Return empty slice for empty input (without error) - Return an error if transformation fails #### 3.2.2 Out Method The `Out` method MUST: - Accept a byte slice as input - Return a byte slice as output - Return nil output for nil input (without error) - Return empty slice for empty input (without error) - For reversible sigils: return the original data before `In` was applied - For irreversible sigils: return the input unchanged (passthrough) ### 3.3 Transmute Function The framework provides a helper function for applying multiple sigils: ``` function Transmute(data: bytes, sigils: Sigil[]) -> (bytes, error): for each sigil in sigils: data, err = sigil.In(data) if err != nil: return nil, err return data, nil ``` ## 4. Sigil Categories ### 4.1 Reversible Sigils Reversible sigils can recover the original input from the output. **Property**: For any valid input `x`: ``` sigil.Out(sigil.In(x)) == x ``` Examples: - Encoding sigils (hex, base64) - Compression sigils (gzip) - Encryption sigils (ChaCha20-Poly1305) ### 4.2 Irreversible Sigils Irreversible sigils perform one-way transformations. **Property**: The `Out` method returns input unchanged: ``` sigil.Out(x) == x ``` Examples: - Hash sigils (SHA-256, MD5) - Truncation sigils ### 4.3 Symmetric Sigils Symmetric sigils have identical `In` and `Out` operations. **Property**: For any input `x`: ``` sigil.In(x) == sigil.Out(x) ``` Examples: - Byte reversal - XOR with fixed key - Bitwise NOT ## 5. Standard Sigils ### 5.1 Encoding Sigils #### 5.1.1 Hex Sigil Encodes data to hexadecimal representation. | Property | Value | |----------|-------| | Name | `hex` | | Category | Reversible | | In | Binary to hex ASCII | | Out | Hex ASCII to binary | | Output expansion | 2x | ``` In("Hello") -> "48656c6c6f" Out("48656c6c6f") -> "Hello" ``` #### 5.1.2 Base64 Sigil Encodes data to Base64 representation (RFC 4648). | Property | Value | |----------|-------| | Name | `base64` | | Category | Reversible | | In | Binary to Base64 ASCII | | Out | Base64 ASCII to binary | | Output expansion | ~1.33x | ``` In("Hello") -> "SGVsbG8=" Out("SGVsbG8=") -> "Hello" ``` ### 5.2 Transformation Sigils #### 5.2.1 Reverse Sigil Reverses the byte order of the data. | Property | Value | |----------|-------| | Name | `reverse` | | Category | Symmetric | | In | Reverse bytes | | Out | Reverse bytes | | Output expansion | 1x | ``` In("Hello") -> "olleH" Out("olleH") -> "Hello" ``` ### 5.3 Compression Sigils #### 5.3.1 Gzip Sigil Compresses data using gzip (RFC 1952). | Property | Value | |----------|-------| | Name | `gzip` | | Category | Reversible | | In | Compress | | Out | Decompress | | Output expansion | Variable (typically < 1x) | ### 5.4 Formatting Sigils #### 5.4.1 JSON Sigil Compacts JSON data by removing whitespace. | Property | Value | |----------|-------| | Name | `json` | | Category | Reversible* | | In | Compact JSON | | Out | Passthrough | *Note: Whitespace is not recoverable; Out returns input unchanged. #### 5.4.2 JSON-Indent Sigil Pretty-prints JSON data with indentation. | Property | Value | |----------|-------| | Name | `json-indent` | | Category | Reversible* | | In | Indent JSON (2 spaces) | | Out | Passthrough | ### 5.5 Encryption Sigils Encryption sigils provide authenticated encryption using AEAD ciphers. #### 5.5.1 ChaCha20-Poly1305 Sigil Encrypts data using XChaCha20-Poly1305 authenticated encryption. | Property | Value | |----------|-------| | Name | `chacha20poly1305` | | Category | Reversible | | Key size | 32 bytes | | Nonce size | 24 bytes (XChaCha variant) | | Tag size | 16 bytes | | In | Encrypt (generates nonce, prepends to output) | | Out | Decrypt (extracts nonce from input prefix) | **Critical Implementation Detail**: The nonce is embedded IN the ciphertext output, not transmitted separately: ``` In(plaintext) -> [24-byte nonce][ciphertext][16-byte tag] Out(ciphertext_with_nonce) -> plaintext ``` **Construction**: ```go sigil, err := NewChaChaPolySigil(key) // key must be 32 bytes ciphertext, err := sigil.In(plaintext) plaintext, err := sigil.Out(ciphertext) ``` **Security Properties**: - Authenticated: Poly1305 MAC prevents tampering - Confidential: ChaCha20 stream cipher - Nonce uniqueness: Random 24-byte nonce per encryption - No nonce management required by caller ### 5.6 Hash Sigils Hash sigils compute cryptographic digests. They are irreversible. | Name | Algorithm | Output Size | |------|-----------|-------------| | `md4` | MD4 | 16 bytes | | `md5` | MD5 | 16 bytes | | `sha1` | SHA-1 | 20 bytes | | `sha224` | SHA-224 | 28 bytes | | `sha256` | SHA-256 | 32 bytes | | `sha384` | SHA-384 | 48 bytes | | `sha512` | SHA-512 | 64 bytes | | `sha3-224` | SHA3-224 | 28 bytes | | `sha3-256` | SHA3-256 | 32 bytes | | `sha3-384` | SHA3-384 | 48 bytes | | `sha3-512` | SHA3-512 | 64 bytes | | `sha512-224` | SHA-512/224 | 28 bytes | | `sha512-256` | SHA-512/256 | 32 bytes | | `ripemd160` | RIPEMD-160 | 20 bytes | | `blake2s-256` | BLAKE2s | 32 bytes | | `blake2b-256` | BLAKE2b | 32 bytes | | `blake2b-384` | BLAKE2b | 48 bytes | | `blake2b-512` | BLAKE2b | 64 bytes | For all hash sigils: - `In(data)` returns the hash digest as raw bytes - `Out(data)` returns data unchanged (passthrough) ## 6. Composition and Chaining ### 6.1 Forward Chain (Packing) Sigils are applied left-to-right: ``` sigils = [gzip, base64, hex] result = Transmute(data, sigils) // Equivalent to: result = hex.In(base64.In(gzip.In(data))) ``` ### 6.2 Reverse Chain (Unpacking) To reverse a chain, apply `Out` in reverse order: ``` function ReverseTransmute(data: bytes, sigils: Sigil[]) -> (bytes, error): for i = length(sigils) - 1 downto 0: data, err = sigils[i].Out(data) if err != nil: return nil, err return data, nil ``` ### 6.3 Chain Properties For a chain of reversible sigils `[s1, s2, s3]`: ``` original = ReverseTransmute(Transmute(data, [s1, s2, s3]), [s1, s2, s3]) // original == data ``` ### 6.4 Mixed Chains Chains MAY contain both reversible and irreversible sigils: ``` sigils = [gzip, sha256] // sha256 is irreversible packed = Transmute(data, sigils) // packed is the SHA-256 hash of gzip-compressed data unpacked = ReverseTransmute(packed, sigils) // unpacked == packed (sha256.Out is passthrough) ``` ## 7. Error Handling ### 7.1 Error Categories | Category | Description | Recovery | |----------|-------------|----------| | Input error | Invalid input format | Check input validity | | State error | Sigil not properly configured | Initialize sigil | | Resource error | Memory/IO failure | Retry or abort | | Algorithm error | Cryptographic failure | Check keys/params | ### 7.2 Error Propagation Errors MUST propagate immediately: ``` function Transmute(data: bytes, sigils: Sigil[]) -> (bytes, error): for each sigil in sigils: data, err = sigil.In(data) if err != nil: return nil, err // Stop immediately return data, nil ``` ### 7.3 Partial Results On error, implementations MUST NOT return partial results. Either: - Return complete transformed data, or - Return nil with an error ## 8. Implementation Guidelines ### 8.1 Sigil Factory Implementations SHOULD provide a factory function: ``` function NewSigil(name: string) -> (Sigil, error): switch name: case "hex": return new HexSigil() case "base64": return new Base64Sigil() case "gzip": return new GzipSigil() // ... etc default: return nil, error("unknown sigil: " + name) ``` ### 8.2 Null Safety ``` function In(data: bytes) -> (bytes, error): if data == nil: return nil, nil // NOT an error if length(data) == 0: return [], nil // Empty slice, NOT nil // ... perform transformation ``` ### 8.3 Immutability Sigils SHOULD NOT modify the input slice: ``` // CORRECT: Create new slice result := make([]byte, len(data)) // ... transform into result // INCORRECT: Modify in place data[0] = transformed // Don't do this ``` ### 8.4 Thread Safety Sigils SHOULD be safe for concurrent use: - Avoid mutable state in sigil instances - Use synchronization if state is required - Document thread-safety guarantees ## 9. Security Considerations ### 9.1 Hash Sigil Security - MD4, MD5, SHA1 are cryptographically broken for collision resistance - Use SHA-256 or stronger for security-critical applications - Hash sigils do NOT provide authentication ### 9.2 Compression Oracle Attacks When combining compression and encryption sigils: - Be aware of CRIME/BREACH-style attacks - Do not compress data containing secrets alongside attacker-controlled data ### 9.3 Memory Safety - Validate output buffer sizes before allocation - Implement maximum input size limits - Handle decompression bombs (zip bombs) ### 9.4 Timing Attacks - Comparison operations should be constant-time where security-relevant - Hash comparisons should use constant-time comparison functions ## 10. Future Work - [ ] AES-GCM encryption sigil for environments requiring AES - [ ] Zstd compression sigil with configurable compression levels - [ ] Streaming sigil interface for large data processing - [ ] Sigil metadata interface for reporting transformation properties - [ ] WebAssembly compilation for browser-based sigil operations - [ ] Hardware acceleration detection and utilization ## 11. References - [RFC 4648] The Base16, Base32, and Base64 Data Encodings - [RFC 1952] GZIP file format specification - [RFC 8259] The JavaScript Object Notation (JSON) Data Interchange Format - [FIPS 180-4] Secure Hash Standard - [FIPS 202] SHA-3 Standard - [RFC 8439] ChaCha20 and Poly1305 for IETF Protocols --- ## Appendix A: Sigil Name Registry | Name | Category | Reversible | Notes | |------|----------|------------|-------| | `reverse` | Transform | Yes (symmetric) | Byte reversal | | `hex` | Encoding | Yes | Hexadecimal | | `base64` | Encoding | Yes | RFC 4648 | | `gzip` | Compression | Yes | RFC 1952 | | `zstd` | Compression | Yes | Zstandard | | `json` | Formatting | Partial | Compacts JSON | | `json-indent` | Formatting | Partial | Pretty-prints JSON | | `chacha20poly1305` | Encryption | Yes | XChaCha20-Poly1305 AEAD | | `md4` | Hash | No | 128-bit | | `md5` | Hash | No | 128-bit | | `sha1` | Hash | No | 160-bit | | `sha224` | Hash | No | 224-bit | | `sha256` | Hash | No | 256-bit | | `sha384` | Hash | No | 384-bit | | `sha512` | Hash | No | 512-bit | | `sha3-*` | Hash | No | SHA-3 family | | `sha512-*` | Hash | No | SHA-512 truncated | | `ripemd160` | Hash | No | 160-bit | | `blake2s-256` | Hash | No | 256-bit | | `blake2b-*` | Hash | No | BLAKE2b family | ## Appendix B: Reference Implementation A reference implementation in Go is available at: - Interface: `github.com/Snider/Enchantrix/pkg/enchantrix/enchantrix.go` - Standard sigils: `github.com/Snider/Enchantrix/pkg/enchantrix/sigils.go` ## Appendix C: Custom Sigil Example ```go // ROT13Sigil implements a simple letter rotation cipher. type ROT13Sigil struct{} func (s *ROT13Sigil) In(data []byte) ([]byte, error) { if data == nil { return nil, nil } result := make([]byte, len(data)) for i, b := range data { if b >= 'A' && b <= 'Z' { result[i] = 'A' + (b-'A'+13)%26 } else if b >= 'a' && b <= 'z' { result[i] = 'a' + (b-'a'+13)%26 } else { result[i] = b } } return result, nil } func (s *ROT13Sigil) Out(data []byte) ([]byte, error) { return s.In(data) // ROT13 is symmetric } ``` ## Appendix D: Changelog - **1.0** (2025-01-13): Initial specification