Version: 0.0
Date: December 2025
Authors: Bogo Development Team
- Overview
- Design Principles
- Format Structure
- Type System
- Encoding Specifications
- Examples
- Implementation Notes
- Version History
Bogo is a compact, efficient binary serialization format designed for high-performance data interchange. It provides a type-safe, schema-less encoding system optimized for both space efficiency and parsing speed.
- Compact Binary Format: Minimal overhead with variable-length encoding
- Type Safety: Strong typing system with explicit type information
- Cross-Platform: Little-endian encoding with well-defined byte layouts
- Extensible: Versioned format supporting future enhancements
- High Performance: Optimized for fast encoding/decoding operations
- Efficiency: Minimize serialized data size through variable-length encoding
- Simplicity: Clear, unambiguous format specification
- Performance: Fast serialization/deserialization with minimal memory allocation
- Compatibility: Consistent cross-platform behavior
- Extensibility: Support for future type additions and format evolution
┌─────────────┬─────────────┬─────────────────┐
│ Version │ Type │ Data │
│ (1 byte) │ (1 byte) │ (variable) │
└─────────────┴─────────────┴─────────────────┘
- Size: 1 byte
- Current Version:
0x00 - Purpose: Format version identification and future compatibility
- Size: 1 byte
- Purpose: Identifies the data type being encoded
- Values: See Type System
| Type ID | Name | Description | Format |
|---|---|---|---|
0x00 |
TypeNull |
Null/nil value | No additional data |
0x01 |
TypeBoolTrue |
Boolean true | No additional data |
0x02 |
TypeBoolFalse |
Boolean false | No additional data |
0x03 |
TypeString |
UTF-8 string | [SizeLen:1][Size:VarInt][Data:Bytes] |
0x04 |
TypeByte |
Single byte (uint8) | [Value:1] |
0x05 |
TypeInt |
Signed integer | [SizeLen:1][Value:VarInt] |
0x06 |
TypeUint |
Unsigned integer | [SizeLen:1][Value:VarInt] |
0x07 |
TypeFloat |
IEEE 754 floating point | [SizeLen:1][Value:Bytes] |
0x08 |
TypeBlob |
Binary data ([]byte) | [SizeLen:1][Size:VarInt][Data:Bytes] |
0x09 |
TypeTimestamp |
64-bit timestamp (ms) | [Timestamp:8] (little-endian) |
0x0A |
TypeUntypedList |
Heterogeneous list | [SizeLen:1][TotalSize:VarInt][Elements:Variable] |
0x0B |
TypeTypedList |
Homogeneous list | [ElementType:1][Count:VarInt][Elements:Variable] |
0x0C |
TypeObject |
Key-value map/object | [SizeLen:1][TotalSize:VarInt][FieldEntries:Variable] |
Bogo uses Go's standard variable-length integer encoding (binary.PutUvarint/binary.Uvarint) for efficient size representation:
Encoding Rules:
- Values 0-127: 1 byte
- Values 128-16383: 2 bytes
- Values 16384-2097151: 3 bytes
- Up to 8 bytes for large values
Structure:
┌─────────────┬─────────────────────────────────┐
│ Size Len │ Size Value │
│ (1 byte) │ (Size Len bytes) │
└─────────────┴─────────────────────────────────┘
- Size Len: Number of bytes needed to store the size (1-8)
- Size Value: Little-endian encoded size using Go's VarInt encoding
- Range: 0 to 2^64-1
Objects use a sophisticated field entry format for efficient encoding and field skipping:
TypeObject + [SizeLen:1][TotalSize:VarInt] + FieldEntries
Each field entry:
[EntrySizeLen:1][EntrySize:VarInt][KeyLen:1][Key:Bytes][Value:EncodedValue]
Benefits:
- Fast field skipping during parsing
- Efficient object traversal
- Self-describing nested structures
- Enables selective field decoding optimization
Purpose: Represents null/nil values
Structure:
┌─────────────┬─────────────┐
│ Version │ TypeNull │
│ 0x00 │ 0x00 │
└─────────────┴─────────────┘
Total Size: 2 bytes
Purpose: Represents boolean values
Structure:
┌─────────────┬─────────────────┐
│ Version │ Bool Type │
│ 0x00 │ 0x01 or 0x02 │
└─────────────┴─────────────────┘
Total Size: 2 bytes
Purpose: Single byte values (uint8)
Structure:
┌─────────────┬─────────────┬─────────────┐
│ Version │ TypeByte │ Value │
│ 0x00 │ 0x04 │ (1 byte) │
└─────────────┴─────────────┴─────────────┘
Total Size: 3 bytes
Purpose: UTF-8 encoded strings
Structure:
┌─────────────┬─────────────┬─────────────────┬─────────────────┐
│ Version │ TypeString │ Length Info │ String Data │
│ 0x00 │ 0x03 │ (VarInt) │ (Length bytes) │
└─────────────┴─────────────┴─────────────────┴─────────────────┘
Example: String "hello"
00 03 01 05 68 65 6C 6C 6F
│ │ │ │ │
│ │ │ │ └── String content "hello"
│ │ │ └──── Length: 5
│ │ └────── Size length: 1 byte needed
│ └────── Type: String
└────── Version
Purpose: Variable-length integer encoding
Structure:
┌─────────────┬─────────────┬─────────────────┬─────────────────┐
│ Version │ Int Type │ Length Info │ Integer Data │
│ 0x00 │ 0x05/0x06 │ (VarInt) │ (Length bytes) │
└─────────────┴─────────────┴─────────────────┴─────────────────┘
Encoding Rules:
- Little-endian byte order
- Minimal byte representation
- Signed integers use two's complement
Purpose: IEEE 754 floating-point numbers
Structure:
┌─────────────┬─────────────┬─────────────────┬─────────────────┐
│ Version │ TypeFloat │ Length Info │ Float Data │
│ 0x00 │ 0x07 │ (VarInt) │ (Length bytes) │
└─────────────┴─────────────┴─────────────────┴─────────────────┘
Encoding:
- Decomposed IEEE 754 representation
- Variable-length based on precision requirements
Purpose: Binary data (byte lists)
Structure:
┌─────────────┬─────────────┬─────────────────┬─────────────────┐
│ Version │ TypeBlob │ Length Info │ Binary Data │
│ 0x00 │ 0x08 │ (VarInt) │ (Length bytes) │
└─────────────┴─────────────┴─────────────────┴─────────────────┘
Purpose: 64-bit millisecond timestamps
Structure:
┌─────────────┬─────────────┬─────────────────────────────┐
│ Version │TypeTimestamp│ Timestamp Value │
│ 0x00 │ 0x09 │ (8 bytes LE) │
└─────────────┴─────────────┴─────────────────────────────┘
Total Size: 10 bytes
Encoding: Little-endian 64-bit signed integer (milliseconds since Unix epoch)
Purpose: Heterogeneous lists with mixed types
Structure:
┌─────────────┬─────────────┬─────────────────┬─────────────────┐
│ Version │TypeUntypedList│ Length Info │ List Data │
│ 0x00 │ 0x0A │ (VarInt) │ (Elements) │
└─────────────┴─────────────┴─────────────────┴─────────────────┘
List Data: Sequence of encoded elements (each with full type headers)
Purpose: Homogeneous lists optimized for same-type elements
Structure:
┌─────────────┬─────────────┬─────────────────┬─────────────┬─────────────────┬─────────────────┐
│ Version │TypeTypedArr │ Length Info │ Element Type│ Count Info │ Elements │
│ 0x00 │ 0x0B │ (VarInt) │ (1 byte) │ (VarInt) │ (Optimized) │
└─────────────┴─────────────┴─────────────────┴─────────────┴─────────────────┴─────────────────┘
Optimization: Elements encoded without individual type headers
Purpose: Key-value maps and structured objects
Structure:
┌─────────────┬─────────────┬─────────────────┬─────────────────┐
│ Version │ TypeObject │ Length Info │ Field Data │
│ 0x00 │ 0x0C │ (VarInt) │ (Fields) │
└─────────────┴─────────────┴─────────────────┴─────────────────┘
Field Entry Format:
┌─────────────────┬─────────────┬─────────────┬─────────────────┐
│ Entry Size │ Key Length │ Key │ Value │
│ (VarInt) │ (1 byte) │ (Key bytes) │ (Encoded) │
└─────────────────┴─────────────┴─────────────┴─────────────────┘
Constraints:
- Key length limited to 255 bytes
- Keys are UTF-8 strings
- Values can be any supported type
Go Code:
data := map[string]any{
"name": "Alice",
"age": 25,
"active": true
}Binary Encoding (hexadecimal):
00 0C 01 1F 01 0C 04 6E 61 6D 65 03 01 05 41 6C 69 63 65
01 07 03 61 67 65 05 01 19 01 09 06 61 63 74 69 76 65 01
Go Code:
numbers := []int{1, 2, 3, 4, 5}Binary Encoding:
00 0B 01 0C 05 01 05 01 01 01 02 01 03 01 04 01 05
Bogo provides both simple functions and configurable encoder/decoder patterns:
Simple API:
// Basic encoding/decoding
data, err := bogo.Encode(value)
value, err := bogo.Decode(data)
// Marshal/Unmarshal aliases
data, err := bogo.Marshal(value)
value, err := bogo.Unmarshal(data)Pattern-Based API:
// Configurable encoder
encoder := bogo.NewEncoder(
bogo.WithMaxDepth(100),
bogo.WithStrictMode(true),
bogo.WithCompactLists(true),
)
data, err := encoder.Encode(value)
// Configurable decoder
decoder := bogo.NewDecoder(
bogo.WithDecoderStrictMode(true),
bogo.WithUnknownTypes(false),
bogo.WithMaxObjectSize(1024*1024),
)
value, err := decoder.Decode(data)
// Statistics collection
statsEncoder := bogo.NewStatsCollector()
data, err := statsEncoder.Encode(value)
stats := statsEncoder.GetStats()- Streaming: Format supports streaming parsers via
EncodeTo/DecodeFrom - Memory: Minimal allocation during encoding/decoding
- CPU: Simple byte operations, no complex transformations
- Statistics: Optional performance monitoring with stats collectors
- Version Mismatch: Parsers must validate version compatibility
- Corruption Detection: Length fields provide basic integrity checking
- Type Safety: Strict type validation required
- New Types: Can be added with new type IDs (0x0D+)
- Version Evolution: Major format changes require version increment
- Backward Compatibility: Older versions should remain parseable
Bogo distinguishes between zero values and null values for all data types:
Zero values are the default values for each data type when no explicit value is provided:
- String:
""(empty string, zero length) - Integer:
0(numeric zero) - Float:
0.0(floating-point zero) - Boolean:
false(boolean false) - Byte:
0x00(zero byte) - Binary: Empty binary data (zero length)
- List: Empty list (zero elements)
- Object: Empty object (zero fields)
- Timestamp: Epoch zero (1970-01-01 00:00:00 UTC)
Encoding: Zero values are encoded using their respective type identifiers with appropriate data representation. Decoding: Zero values decode back to their language-specific zero/default values, preserving type information.
Null values represent the explicit absence of a value:
- Explicit null: Language-specific null/nil/None representation
- Null pointers: Uninitialized or explicitly null references
- Optional types: Unset optional values
Encoding: All null values are encoded as TypeNull (0x00).
Decoding: All null values decode back to the language's null representation.
| Value Type | Example Value | Encoded As | Decodes To |
|---|---|---|---|
| Zero string | "" |
TypeString + 0 length | Empty string |
| Zero integer | 0 |
TypeInt + zero | Numeric zero |
| Zero boolean | false |
TypeBoolFalse | Boolean false |
| Empty list | [] |
TypeUntypedList + 0 elements | Empty list |
| Null value | null |
TypeNull | Language null |
Objects support both zero values and null values as field values:
{
"zero_string": "", // TypeString with zero length
"zero_number": 0, // TypeInt with value zero
"null_value": null // TypeNull
}
This distinction enables tri-state logic for data types:
- True/Present: Actual value (e.g.,
true,"hello",42) - False/Empty: Zero value (e.g.,
false,"",0) - Unknown/Unset: Null value (
null)
This is particularly useful for:
- Optional fields that can be unset (null), empty (zero), or have a value
- Boolean fields that can be true, false, or unknown
- Numeric fields that can have a value, be zero, or be unspecified
- Buffer Bounds: All length fields must be validated
- Resource Limits: Implement reasonable size limits
- Recursion: Limit nesting depth for objects/lists
- Memory: Prevent excessive memory allocation
- Initial specification
- Core type system implementation
- Variable-length encoding
- Object and list support
Note: This specification is subject to change. Implementations should verify version compatibility before processing bogo-encoded data.