13 KiB
RFC-0002: TRIX Binary Container Format
Status: Standards Track Version: 2.0 Created: 2025-01-13 Author: Snider
Abstract
This document specifies the TRIX binary container format, a generic and extensible file format designed to store arbitrary binary payloads alongside structured JSON metadata. The format is protocol-agnostic, supporting any encryption scheme, compression algorithm, or data transformation while providing a consistent structure for metadata discovery and payload extraction.
Table of Contents
- Introduction
- Terminology
- Format Specification
- Header Specification
- Encoding Process
- Decoding Process
- Checksum Verification
- Magic Number Registry
- Security Considerations
- IANA Considerations
- References
1. Introduction
The TRIX format addresses the need for a simple, self-describing binary container that can wrap any payload type with extensible metadata. Unlike format-specific containers (such as encrypted archive formats), TRIX separates the concerns of:
- Container structure: How data is organized on disk/wire
- Payload semantics: What the payload contains and how to process it
- Metadata extensibility: Application-specific attributes
1.1 Design Goals
- Simplicity: Minimal overhead, easy to implement
- Extensibility: JSON header allows arbitrary metadata
- Protocol-agnostic: No assumptions about payload encryption or encoding
- Streaming-friendly: Header length prefix enables streaming reads
- Magic-number customizable: Applications can define their own identifiers
1.2 Use Cases
- Encrypted data interchange
- Signed document containers
- Configuration file packaging
- Backup archive format
- Inter-service message envelopes
2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
Container: A complete TRIX-formatted byte sequence Magic Number: A 4-byte identifier at the start of the container Header: A JSON object containing metadata about the payload Payload: The arbitrary binary data stored in the container Checksum: An optional integrity verification value
3. Format Specification
3.1 Overview
A TRIX container consists of five sequential fields:
+----------------+---------+---------------+----------------+-----------+
| Magic Number | Version | Header Length | JSON Header | Payload |
+----------------+---------+---------------+----------------+-----------+
| 4 bytes | 1 byte | 4 bytes | Variable | Variable |
Total minimum size: 9 bytes (empty header, empty payload)
3.2 Field Definitions
3.2.1 Magic Number (4 bytes)
A 4-byte ASCII string identifying the file type. This field:
- MUST be exactly 4 bytes
- SHOULD contain printable ASCII characters
- Is application-defined (not mandated by this specification)
Common conventions:
TRIX- Generic TRIX container- First character uppercase, application-specific identifier
3.2.2 Version (1 byte)
An unsigned 8-bit integer indicating the format version.
| Value | Description |
|---|---|
| 0x00 | Reserved |
| 0x01 | Version 1.0 (deprecated) |
| 0x02 | Version 2.0 (current) |
| 0x03-0xFF | Reserved for future versions |
Implementations MUST reject containers with unrecognized versions.
3.2.3 Header Length (4 bytes)
A 32-bit unsigned integer in big-endian byte order specifying the length of the JSON Header in bytes.
- Minimum value: 0 (empty header represented as
{}is 2 bytes, but 0 is valid) - Maximum value: 16,777,215 (16 MB - 1 byte)
Implementations MUST reject headers exceeding 16 MB to prevent denial-of-service attacks.
Header Length = BigEndian32(length_of_json_header_bytes)
3.2.4 JSON Header (Variable)
A UTF-8 encoded JSON object containing metadata. The header:
- MUST be valid JSON (RFC 8259)
- MUST be a JSON object (not array, string, or primitive)
- SHOULD use UTF-8 encoding without BOM
- MAY be empty (
{})
3.2.5 Payload (Variable)
The arbitrary binary payload. The payload:
- MAY be empty (zero bytes)
- MAY contain any binary data
- Length is implicitly determined by:
container_length - 9 - header_length
4. Header Specification
4.1 Reserved Header Fields
The following header fields have defined semantics:
| Field | Type | Description |
|---|---|---|
content_type |
string | MIME type of the payload (before any transformations) |
checksum |
string | Hex-encoded checksum of the payload |
checksum_algo |
string | Algorithm used for checksum (e.g., "sha256") |
created_at |
string | ISO 8601 timestamp of creation |
encryption_algorithm |
string | Encryption algorithm identifier |
compression |
string | Compression algorithm identifier |
sigils |
array | Ordered list of transformation sigil names |
4.2 Extension Fields
Applications MAY include additional fields. To avoid conflicts:
- Custom fields SHOULD use a namespace prefix (e.g.,
x-myapp-field) - Standard field names are lowercase with underscores
4.3 Example Headers
Encrypted payload:
{
"content_type": "application/octet-stream",
"encryption_algorithm": "xchacha20poly1305",
"created_at": "2025-01-13T12:00:00Z"
}
Compressed and encoded payload:
{
"content_type": "text/plain",
"compression": "gzip",
"sigils": ["gzip", "base64"],
"checksum": "a591a6d40bf420404a011733cfb7b190d62c65bf0bcda32b57b277d9ad9f146e",
"checksum_algo": "sha256"
}
Minimal header:
{}
5. Encoding Process
5.1 Algorithm
function Encode(payload: bytes, header: object, magic: string) -> bytes:
// Validate magic number
if length(magic) != 4:
return error("magic number must be 4 bytes")
// Serialize header to JSON
header_bytes = JSON.serialize(header)
header_length = length(header_bytes)
// Validate header size
if header_length > 16777215:
return error("header exceeds maximum size")
// Build container
container = empty byte buffer
// Write magic number (4 bytes)
container.write(magic)
// Write version (1 byte)
container.write(0x02)
// Write header length (4 bytes, big-endian)
container.write(BigEndian32(header_length))
// Write JSON header
container.write(header_bytes)
// Write payload
container.write(payload)
return container.bytes()
5.2 Checksum Integration
If integrity verification is required:
function EncodeWithChecksum(payload: bytes, header: object, magic: string, algo: string) -> bytes:
checksum = Hash(algo, payload)
header["checksum"] = HexEncode(checksum)
header["checksum_algo"] = algo
return Encode(payload, header, magic)
6. Decoding Process
6.1 Algorithm
function Decode(container: bytes, expected_magic: string) -> (header: object, payload: bytes):
// Validate minimum size
if length(container) < 9:
return error("container too small")
// Read and verify magic number
magic = container[0:4]
if magic != expected_magic:
return error("invalid magic number")
// Read and verify version
version = container[4]
if version != 0x02:
return error("unsupported version")
// Read header length
header_length = BigEndian32(container[5:9])
// Validate header length
if header_length > 16777215:
return error("header length exceeds maximum")
if length(container) < 9 + header_length:
return error("container truncated")
// Read and parse header
header_bytes = container[9:9+header_length]
header = JSON.parse(header_bytes)
// Read payload
payload = container[9+header_length:]
return (header, payload)
6.2 Streaming Decode
For large files, streaming decode is RECOMMENDED:
function StreamDecode(reader: Reader, expected_magic: string) -> (header: object, payload_reader: Reader):
// Read fixed-size prefix
prefix = reader.read(9)
// Validate magic and version
magic = prefix[0:4]
version = prefix[4]
header_length = BigEndian32(prefix[5:9])
// Read header
header_bytes = reader.read(header_length)
header = JSON.parse(header_bytes)
// Return remaining reader for payload streaming
return (header, reader)
7. Checksum Verification
7.1 Supported Algorithms
| Algorithm ID | Output Size | Notes |
|---|---|---|
md5 |
16 bytes | NOT RECOMMENDED for security |
sha1 |
20 bytes | NOT RECOMMENDED for security |
sha256 |
32 bytes | RECOMMENDED |
sha384 |
48 bytes | |
sha512 |
64 bytes | |
blake2b-256 |
32 bytes | |
blake2b-512 |
64 bytes |
7.2 Verification Process
function VerifyChecksum(header: object, payload: bytes) -> bool:
if "checksum" not in header:
return true // No checksum to verify
algo = header["checksum_algo"]
expected = HexDecode(header["checksum"])
actual = Hash(algo, payload)
return constant_time_compare(expected, actual)
8. Magic Number Registry
This section defines conventions for magic number allocation:
8.1 Reserved Magic Numbers
| Magic | Reserved For |
|---|---|
TRIX |
Generic TRIX containers |
\x00\x00\x00\x00 |
Reserved (null) |
\xFF\xFF\xFF\xFF |
Reserved (test/invalid) |
8.2 Registered Magic Numbers
The following magic numbers are registered for specific applications:
| Magic | Application | Description |
|---|---|---|
SMSG |
Borg | Encrypted message/media container |
STIM |
Borg | Encrypted TIM container bundle |
STMF |
Borg | Secure To-Me Form (encrypted form data) |
TRIX |
Borg | Encrypted DataNode archive |
8.3 Allocation Guidelines
Applications SHOULD:
- Use 4 printable ASCII characters
- Start with an uppercase letter
- Avoid common file format magic numbers (e.g.,
%PDF,PK\x03\x04) - Register custom magic numbers in their documentation
9. Security Considerations
9.1 Header Injection
The JSON header is parsed before processing. Implementations MUST:
- Validate JSON syntax strictly
- Reject headers with duplicate keys
- Not execute header field values as code
9.2 Denial of Service
The 16 MB header limit prevents memory exhaustion attacks. Implementations SHOULD:
- Reject headers before full allocation if length exceeds limit
- Implement timeouts for header parsing
- Limit recursion depth in JSON parsing
9.3 Path Traversal
Header fields like filename MUST NOT be used directly for filesystem operations without sanitization.
9.4 Checksum Security
- MD5 and SHA1 checksums provide integrity but not authenticity
- For tamper detection, use HMAC or digital signatures
- Checksum verification MUST use constant-time comparison
9.5 Version Negotiation
Implementations MUST NOT attempt to parse containers with unknown versions, as the format may change incompatibly.
10. IANA Considerations
This document does not require IANA actions. The TRIX format is application-defined and does not use IANA-managed namespaces.
Future versions may define:
- Media type registration (e.g.,
application/x-trix) - Magic number registry
11. Future Work
- Media type registration (
application/x-trix,application/x-smsg, etc.) - Formal magic number registry with registration process
- Streaming encoding/decoding for large payloads
- Header compression for bandwidth-constrained environments
- Sub-container nesting specification (Trix within Trix)
12. References
- [RFC 8259] The JavaScript Object Notation (JSON) Data Interchange Format
- [RFC 2119] Key words for use in RFCs to Indicate Requirement Levels
- [RFC 6838] Media Type Specifications and Registration Procedures
Appendix A: Binary Layout Diagram
Byte offset: 0 4 5 9 9+H 9+H+P
|---------|----|---------|---------|---------|
| Magic | V | HdrLen | Header | Payload |
| (4) |(1) | (4) | (H) | (P) |
|---------|----|---------|---------|---------|
V = Version byte
H = Header length (from HdrLen field)
P = Payload length (remaining bytes)
Appendix B: Reference Implementation
A reference implementation in Go is available at:
github.com/Snider/Enchantrix/pkg/trix/trix.go
Appendix C: Changelog
- 2.0 (2025-01-13): Current version with JSON header
- 1.0 (deprecated): Initial version with fixed header fields