Content Blocks

Section: Core Specification
Version: 0.1

1. Overview

The content layer represents document content as a tree of semantic blocks. This approach:

Separates meaning from presentation
Enables consistent editing across implementations
Supports reliable machine extraction
Facilitates accessibility

2. Content File

2.1 Location and Format

The content file:

MUST be located at content/document.json (or at the path specified in the manifest)
MUST be valid JSON conforming to RFC 8259
MUST be encoded as UTF-8 without BOM

2.2 Root Structure

{
  "version": "0.1",
  "blocks": [
    { "type": "heading", "level": 1, "children": [...] },
    { "type": "paragraph", "children": [...] },
    ...
  ]
}

Field	Type	Required	Description
`version`	string	Yes	Content model version
`blocks`	array	Yes	Array of block objects

3. Block Model

3.1 Block Object

Every block has the following base structure:

{
  "type": "paragraph",
  "id": "block-123",
  "children": [...],
  "attributes": {}
}

Field	Type	Required	Description
`type`	string	Yes	Block type identifier
`id`	string	No	Unique block identifier within document
`children`	array	Varies	Child nodes (blocks or text)
`attributes`	object	No	Type-specific attributes

3.2 Block Identifiers

Block IDs:

MUST be unique within the document
SHOULD be present on all blocks
MUST be present when the document uses any extension that references content positions (the collaboration, phantoms, semantic, or presentation extensions)
MUST be stable across edits for documents in REVIEW or later states
SHOULD be stable across edits in DRAFT state (for collaboration)
SHOULD use URL-safe characters

Block IDs share the document-wide ID namespace with named anchor IDs (see Anchors and References specification). Block IDs and anchor IDs MUST be unique across both sets.
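
The shared-namespace rule can be checked with a single pass over the tree. The following is a non-normative sketch in Python; the function names `iter_ids` and `duplicate_ids` are illustrative and not part of this specification, and the walk only follows `children` arrays (subfigure IDs, for example, are not visited).

```python
from collections import Counter
from typing import Iterator


def iter_ids(node: dict) -> Iterator[str]:
    """Yield the block ID and any anchor-mark IDs found under `node`."""
    if "id" in node:
        yield node["id"]
    for mark in node.get("marks", []):
        # Simple marks are plain strings; only object marks can be anchors.
        if isinstance(mark, dict) and mark.get("type") == "anchor":
            yield mark["id"]
    for child in node.get("children", []):
        yield from iter_ids(child)


def duplicate_ids(document: dict) -> list[str]:
    """Return IDs that occur more than once across blocks and anchors."""
    counts = Counter(i for block in document["blocks"] for i in iter_ids(block))
    return sorted(i for i, n in counts.items() if n > 1)
```

A document passes this check when `duplicate_ids` returns an empty list.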

4. Core Block Types

4.1 Text Leaf Nodes

Text nodes are the leaf nodes that contain actual text content.

{
  "type": "text",
  "value": "Hello, world",
  "marks": ["bold", "italic"]
}

Field	Type	Required	Description
`type`	string	Yes	Always "text"
`value`	string	Yes	Text content
`marks`	array	No	Formatting marks

4.1.1 Standard Marks

Mark	Description
`bold`	Bold/strong text
`italic`	Italic/emphasized text
`underline`	Underlined text
`strikethrough`	Struck-through text
`code`	Inline code (monospace)
`superscript`	Superscript text
`subscript`	Subscript text
`anchor`	Named anchor point (see below and Anchors and References spec)

4.1.1a Anchor Mark

The anchor mark places a named, stable anchor point within text. Anchor marks enable internal references that survive content edits.

{
  "type": "text",
  "value": "key concept",
  "marks": [
    { "type": "anchor", "id": "def-key-concept" }
  ]
}

Field	Type	Required	Description
`type`	string	Yes	Always `"anchor"`
`id`	string	Yes	Unique anchor identifier (shares namespace with block IDs)

See the Anchors and References specification for full details on the anchor system, including Content Anchor URIs, character offset computation, and validation rules.

4.1.2 Link Mark

Links use an extended mark format:

{
  "type": "text",
  "value": "click here",
  "marks": [
    {
      "type": "link",
      "href": "https://example.com",
      "title": "Example Site"
    }
  ]
}

The href field also accepts Content Anchor URIs for internal links. Values beginning with # are interpreted as internal references:

{
  "type": "text",
  "value": "See the introduction",
  "marks": [
    {
      "type": "link",
      "href": "#intro",
      "title": "Introduction"
    }
  ]
}

See the Anchors and References specification for the full Content Anchor URI syntax.
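
As a non-normative illustration, a reader might separate internal references from external links as sketched below. The helper name `internal_link_targets` is hypothetical, and the sketch returns everything after the leading # unparsed, since the full Content Anchor URI syntax is defined elsewhere.

```python
def internal_link_targets(text_node: dict) -> list[str]:
    """Return the fragment of every link mark whose href starts with '#'."""
    targets = []
    for mark in text_node.get("marks", []):
        # Link marks use the extended (object) form; string marks are skipped.
        if isinstance(mark, dict) and mark.get("type") == "link":
            href = mark.get("href", "")
            if href.startswith("#"):
                targets.append(href[1:])
    return targets
```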

4.1.3 Math Mark

Inline mathematical expressions use the math mark:

{
  "type": "text",
  "value": "7.677(9)×10²",
  "marks": [
    {
      "type": "math",
      "format": "latex",
      "source": "7.677(9) \\times 10^{2}"
    }
  ]
}

Field	Type	Required	Description
`type`	string	Yes	Always `"math"`
`format`	string	Yes	`"latex"` or `"mathml"`
`source`	string	Yes	Math content in specified format

The text node's value field contains the display/fallback text (used for plain text rendering and accessibility). The mark's source field contains the mathematical notation for proper typesetting.

This mark enables inline math without breaking text flow, unlike the block-level math type which creates a separate block element.
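
Because marks appear either as plain strings (for example "bold") or as objects (for example link, anchor, and math marks), consumers usually need one accessor that handles both shapes. The sketch below is non-normative; `mark_type`, `has_mark`, and `plain_text` are illustrative names rather than part of the specification.

```python
def mark_type(mark) -> str:
    """Return the type of a mark given in string or object form."""
    return mark if isinstance(mark, str) else mark["type"]


def has_mark(text_node: dict, name: str) -> bool:
    """Report whether a text node carries a mark of the given type."""
    return any(mark_type(m) == name for m in text_node.get("marks", []))


def plain_text(nodes: list[dict]) -> str:
    """Concatenate the fallback text of text nodes, ignoring all marks."""
    return "".join(n["value"] for n in nodes if n.get("type") == "text")
```

For a text node carrying a math mark, `plain_text` returns the display/fallback `value`, not the mark's `source`.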

4.2 Paragraph

Standard paragraph block.

{
  "type": "paragraph",
  "children": [
    { "type": "text", "value": "This is a paragraph with " },
    { "type": "text", "value": "bold", "marks": ["bold"] },
    { "type": "text", "value": " text." }
  ]
}

Children: Text nodes only

4.3 Heading

Section heading with level.

{
  "type": "heading",
  "level": 2,
  "children": [
    { "type": "text", "value": "Section Title" }
  ]
}

Attribute	Type	Required	Description
`level`	integer	Yes	Heading level (1-6)

Children: Text nodes only

4.4 List

Ordered or unordered list.

{
  "type": "list",
  "ordered": true,
  "start": 1,
  "children": [
    {
      "type": "listItem",
      "children": [
        { "type": "paragraph", "children": [...] }
      ]
    }
  ]
}

Attribute	Type	Required	Description
`ordered`	boolean	Yes	Whether list is ordered
`start`	integer	No	Starting number (ordered lists only)

Children: listItem blocks only

4.5 List Item

Individual list item (only valid as child of list).

{
  "type": "listItem",
  "checked": null,
  "children": [
    { "type": "paragraph", "children": [...] }
  ]
}

Attribute	Type	Required	Description
`checked`	boolean or null	No	Checkbox state (null = not a checkbox)

Children: Paragraph, list, or other block-level content

4.6 Blockquote

Quoted content block.

{
  "type": "blockquote",
  "children": [
    { "type": "paragraph", "children": [...] }
  ]
}

Children: Paragraph or other block-level content

4.7 Code Block

Block of source code or preformatted text.

{
  "type": "codeBlock",
  "language": "javascript",
  "children": [
    { "type": "text", "value": "function hello() {\n  return 'world';\n}" }
  ]
}

Attribute	Type	Required	Description
`language`	string	No	Programming language identifier
`highlighting`	string	No	Highlighting mode: `"none"` or `"tokens"`
`tokens`	array	No	Pre-tokenized highlighting (when `highlighting="tokens"`)

Children: Single text node (no marks)

Language identifiers SHOULD follow common conventions (e.g., "javascript", "python", "rust").

4.7.1 Syntax Highlighting

For documents requiring stable, portable syntax highlighting, code blocks can include pre-tokenized content:

{
  "type": "codeBlock",
  "language": "python",
  "highlighting": "tokens",
  "tokens": [
    { "type": "keyword", "value": "def " },
    { "type": "function", "value": "hello" },
    { "type": "punctuation", "value": "():" },
    { "type": "plain", "value": "\n    " },
    { "type": "keyword", "value": "return " },
    { "type": "string", "value": "\"world\"" }
  ],
  "children": [
    { "type": "text", "value": "def hello():\n    return \"world\"" }
  ]
}

Token Types:

Token Type	Description
`keyword`	Language keywords (if, for, def, class, etc.)
`function`	Function names
`class`	Class names
`variable`	Variable names
`parameter`	Function parameters
`string`	String literals
`number`	Numeric literals
`boolean`	Boolean literals
`null`	Null/nil/None values
`comment`	Code comments
`docstring`	Documentation strings
`operator`	Operators (+, -, *, etc.)
`punctuation`	Punctuation marks
`delimiter`	Delimiters (braces, brackets, etc.)
`type`	Type annotations
`namespace`	Namespace/module identifiers
`decorator`	Decorators/annotations
`plain`	Plain text (default/fallback)

Behavior:

If highlighting is absent or "none", renderers use the children text node (the default behavior)
If highlighting is "tokens", renderers use the tokens array for colored output
Renderers MAY re-highlight from children if they don't support the tokens format
The children field MUST always contain the complete source code for fallback and accessibility
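
In the example above, concatenating the token values reproduces the children text exactly. This specification does not state that as a requirement, so the following non-normative Python sketch treats it as an assumption that a producer may wish to verify; `tokens_match_source` is a hypothetical helper name.

```python
def tokens_match_source(code_block: dict) -> bool:
    """Check that the joined token values equal the fallback source text."""
    if code_block.get("highlighting") != "tokens":
        return True  # nothing to compare when tokens are not in use
    source = "".join(c["value"] for c in code_block.get("children", []))
    joined = "".join(t["value"] for t in code_block.get("tokens", []))
    return joined == source
```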

4.8 Horizontal Rule

Thematic break between sections.

{
  "type": "horizontalRule"
}

Children: None (void element)

4.9 Image

Embedded or referenced image.

{
  "type": "image",
  "src": "assets/images/figure1.png",
  "alt": "Diagram showing system architecture",
  "title": "Figure 1: System Architecture",
  "width": 800,
  "height": 600
}

Attribute	Type	Required	Description
`src`	string	Yes	Image source (path or URL)
`alt`	string	Yes	Alternative text (accessibility)
`title`	string	No	Image title/caption
`width`	integer	No	Intrinsic width in pixels
`height`	integer	No	Intrinsic height in pixels

Children: None (void element)

For embedded images, src MUST be a relative path within the archive (e.g., assets/images/figure1.png).

4.10 Table

Tabular data.

{
  "type": "table",
  "children": [
    {
      "type": "tableRow",
      "header": true,
      "children": [
        { "type": "tableCell", "children": [
          { "type": "paragraph", "children": [{ "type": "text", "value": "Name" }] }
        ]},
        { "type": "tableCell", "children": [
          { "type": "paragraph", "children": [{ "type": "text", "value": "Value" }] }
        ]}
      ]
    },
    {
      "type": "tableRow",
      "children": [
        { "type": "tableCell", "children": [
          { "type": "paragraph", "children": [{ "type": "text", "value": "Foo" }] }
        ]},
        { "type": "tableCell", "children": [
          { "type": "paragraph", "children": [{ "type": "text", "value": "42" }] }
        ]}
      ]
    }
  ]
}

Children: tableRow blocks only

4.11 Table Row

Row within a table.

{
  "type": "tableRow",
  "header": false,
  "children": [...]
}

Attribute	Type	Required	Description
`header`	boolean	No	Whether row is a header row

Children: tableCell blocks only

4.12 Table Cell

Cell within a table row.

{
  "type": "tableCell",
  "colspan": 1,
  "rowspan": 1,
  "align": "left",
  "children": [...]
}

Attribute	Type	Required	Description
`colspan`	integer	No	Number of columns to span (default: 1)
`rowspan`	integer	No	Number of rows to span (default: 1)
`align`	string	No	Text alignment: "left", "center", "right"

Children: Block-level content (typically paragraph blocks)

4.12.1 Simplified Cell Content

For simple cells containing only plain or formatted text, implementations MAY accept text nodes directly as children (without wrapping in a paragraph block). This is a shorthand for common cases:

Standard form (block-level children):

{
  "type": "tableCell",
  "children": [
    { "type": "paragraph", "children": [{ "type": "text", "value": "Name" }] }
  ]
}

Simplified form (text nodes directly):

{
  "type": "tableCell",
  "children": [
    { "type": "text", "value": "Name" }
  ]
}

Implementations MUST support the standard form. Support for the simplified form is OPTIONAL for readers, and writers SHOULD NOT use the simplified form when generating documents intended for maximum compatibility.
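
A reader that accepts the simplified form can convert it to the standard form on load, so the rest of its pipeline only handles block-level children. The sketch below is non-normative, and `normalize_cell` is an illustrative name.

```python
def normalize_cell(cell: dict) -> dict:
    """Wrap bare text-node children of a tableCell in a single paragraph."""
    children = cell.get("children", [])
    if children and all(c.get("type") == "text" for c in children):
        paragraph = {"type": "paragraph", "children": children}
        return {**cell, "children": [paragraph]}
    return cell  # already in standard form (or empty)
```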

4.13 Math

Mathematical content using MathML or LaTeX.

{
  "type": "math",
  "display": true,
  "format": "latex",
  "value": "E = mc^2"
}

Attribute	Type	Required	Description
`display`	boolean	Yes	Display mode (true) vs inline (false)
`format`	string	Yes	`"latex"` or `"mathml"`
`value`	string	Yes	Math content in specified format

Format details:

"latex" — LaTeX math mode using amsmath package conventions. Content is the expression only (no $ or \begin{equation} delimiters). Example: "E = mc^2", "\\frac{1}{2}", "\\sum_{i=1}^{n} x_i"
"mathml" — MathML 3.0 (W3C Recommendation). Content is the <math> element body

Children: None (content in value attribute)

4.14 Break

Line break within a block.

{
  "type": "break"
}

Children: None (void element)

Used for hard/forced line breaks within paragraphs.

4.14.1 Inline Breaks vs Block Breaks

Text nodes MAY contain newline characters (\n) for soft line breaks. These represent inline breaks where the text flow continues but a line break is rendered. Soft breaks are typically used for poetry, addresses, or other content where line breaks are semantically meaningful but not paragraph separators.

The break block represents a hard break — a forced line break that interrupts text flow. This is equivalent to HTML's <br> element.

Type	Representation	Use Case
Soft break	`\n` in text value	Poetry lines, addresses, natural line continuation
Hard break	`{ "type": "break" }`	Forced breaks in structured content
Paragraph break	New paragraph block	Semantic paragraph separation

Example with soft breaks:

{
  "type": "paragraph",
  "children": [
    { "type": "text", "value": "Roses are red,\nViolets are blue" }
  ]
}

4.15 Definition List

Definition lists represent term-description pairs, commonly used for glossaries, metadata displays, and key-value content.

{
  "type": "definitionList",
  "children": [
    {
      "type": "definitionItem",
      "children": [
        {
          "type": "definitionTerm",
          "children": [{ "type": "text", "value": "Nagasa" }]
        },
        {
          "type": "definitionDescription",
          "children": [
            {
              "type": "paragraph",
              "children": [{ "type": "text", "value": "Blade length measured from mune-machi to kissaki." }]
            }
          ]
        }
      ]
    },
    {
      "type": "definitionItem",
      "children": [
        {
          "type": "definitionTerm",
          "children": [{ "type": "text", "value": "Sori" }]
        },
        {
          "type": "definitionDescription",
          "children": [
            {
              "type": "paragraph",
              "children": [{ "type": "text", "value": "Curvature of the blade." }]
            }
          ]
        }
      ]
    }
  ]
}

Block Type	Children	Description
`definitionList`	`definitionItem` only	Container for definition items
`definitionItem`	`definitionTerm`, `definitionDescription`	Groups a term with its description(s)
`definitionTerm`	Text nodes only	The term being defined
`definitionDescription`	Block-level content	The definition or description

A definitionItem MUST contain at least one definitionTerm and at least one definitionDescription. Multiple terms MAY share a single description, and a single term MAY have multiple descriptions.
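
The constraint on definitionItem children could be checked as in the following non-normative sketch. The function name `definition_item_errors` and the error strings are illustrative only.

```python
def definition_item_errors(item: dict) -> list[str]:
    """List violations of the definitionItem child rules."""
    types = [c.get("type") for c in item.get("children", [])]
    allowed = {"definitionTerm", "definitionDescription"}
    errors = [f"unexpected child type: {t}" for t in types if t not in allowed]
    if "definitionTerm" not in types:
        errors.append("missing definitionTerm")
    if "definitionDescription" not in types:
        errors.append("missing definitionDescription")
    return errors
```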

4.16 Measurement

Semantic representation of a measurement with optional uncertainty and units. Used for scientific, engineering, and metrology documents.

{
  "type": "measurement",
  "value": 7.677,
  "uncertainty": 0.009,
  "uncertaintyNotation": "parenthetical",
  "exponent": 2,
  "unit": "mm",
  "display": "7.677(9)×10² mm"
}

Attribute	Type	Required	Description
`value`	number	Yes	Numeric value
`uncertainty`	number	No	Measurement uncertainty
`uncertaintyNotation`	string	No	How uncertainty is displayed
`exponent`	integer	No	Power of 10 for scientific notation
`unit`	string	No	Unit of measurement
`display`	string	Yes	Human-readable display string

4.16.1 Uncertainty Notation

Value	Display Format	Example
`parenthetical`	Uncertainty in last digit(s)	7.677(9)
`plusminus`	Plus-minus format	7.677 ± 0.009
`range`	Range format	7.668–7.686
`percent`	Percentage	7.677 ± 0.12%

The display field is REQUIRED for accessibility and fallback rendering. It SHOULD accurately represent the measurement as intended for human readers.

Children: None (void element)

Note: For complex expressions like vectors, matrices, or multi-variable measurements, use the math block type with LaTeX or MathML.
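
To illustrate how the four uncertaintyNotation values relate to one another, the non-normative sketch below derives each display form from the same value and uncertainty. The function name `format_uncertainty` is hypothetical; the sketch ignores exponent and unit, assumes the uncertainty has no more decimal places than the value, and rounds the percentage to two decimals (a choice made here to match the table, not one stated by this specification).

```python
from decimal import Decimal


def format_uncertainty(value: float, uncertainty: float, notation: str) -> str:
    """Format a value and its uncertainty in one of the four notations."""
    # Go through str() so 7.677 becomes Decimal("7.677"), not a binary float.
    v, u = Decimal(str(value)), Decimal(str(uncertainty))
    if notation == "parenthetical":
        # Express the uncertainty in units of the value's last decimal place.
        digits = int(u.scaleb(-v.as_tuple().exponent))
        return f"{v}({digits})"
    if notation == "plusminus":
        return f"{v} ± {u}"
    if notation == "range":
        return f"{v - u}–{v + u}"
    if notation == "percent":
        percent = u / v * 100
        return f"{v} ± {percent:.2f}%"
    raise ValueError(f"unknown uncertainty notation: {notation}")
```

With value 7.677 and uncertainty 0.009, the four notations give 7.677(9), 7.677 ± 0.009, 7.668–7.686, and 7.677 ± 0.12%, matching the table above.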

4.17 Signature

Semantic representation of a signature with optional image, signer information, and timestamp.

{
  "type": "signature",
  "id": "sig-metrologist",
  "signatureType": "handwritten",
  "image": "assets/signatures/metrologist.png",
  "signer": {
    "name": "Dr. Jane Smith",
    "title": "Chief Metrologist",
    "organization": "BayKen Metrology"
  },
  "timestamp": "2026-01-29T14:15:28Z",
  "purpose": "certification",
  "digitalSignatureRef": "security/signatures.json#sig-metrologist"
}

Attribute	Type	Required	Description
`signatureType`	string	Yes	Type of signature
`image`	string	No	Path to signature image
`signer`	object	No	Signer identity information
`timestamp`	string	No	ISO 8601 timestamp
`purpose`	string	No	Purpose of the signature
`digitalSignatureRef`	string	No	Reference to cryptographic signature

4.17.1 Signature Types

Value	Description
`handwritten`	Scanned or drawn handwritten signature
`digital`	Cryptographic digital signature
`electronic`	Typed name or simple electronic mark
`stamp`	Official seal or stamp image

4.17.2 Signature Purpose

Value	Description
`certification`	Certifies document accuracy or compliance
`approval`	Approves content or action
`witness`	Witnesses another signature or event
`acknowledgment`	Acknowledges receipt or understanding
`authorship`	Indicates document authorship

The signer object MAY contain:

name (string) - Signer's full name
title (string) - Professional title
organization (string) - Organization name
email (string) - Contact email
id (string) - Unique identifier

Children: None (void element)

4.18 SVG

Scalable Vector Graphics for charts, diagrams, and illustrations.

{
  "type": "svg",
  "src": "assets/graphics/chart.svg",
  "alt": "Mass vs Arc Length chart showing linear relationship",
  "title": "Figure 1: Mass/Arc Length Distribution",
  "width": 400,
  "height": 300
}

Alternative inline form:

{
  "type": "svg",
  "content": "<svg xmlns=\"http://www.w3.org/2000/svg\" viewBox=\"0 0 100 100\">...</svg>",
  "alt": "Simple diagram"
}

Attribute	Type	Required	Description
`src`	string	No*	Path to SVG file
`content`	string	No*	Inline SVG content
`alt`	string	Yes	Alternative text (accessibility)
`title`	string	No	Title/caption
`width`	integer	No	Display width in pixels
`height`	integer	No	Display height in pixels

*Either src or content MUST be provided, but not both.

For src, the path MUST be a relative path within the archive (e.g., assets/graphics/chart.svg) or a URL.

For content, the value MUST be a complete SVG element including the xmlns attribute.
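
The src/content rules can be expressed as a small check. This is a non-normative sketch: `svg_source_error` is an illustrative name, and the xmlns test is a plain substring search rather than real XML parsing.

```python
from typing import Optional


def svg_source_error(block: dict) -> Optional[str]:
    """Return a message if the svg block breaks the src/content rules."""
    has_src = "src" in block
    has_content = "content" in block
    if has_src and has_content:
        return "both src and content are present"
    if not has_src and not has_content:
        return "neither src nor content is present"
    # Crude check only: a substring match, not an XML parse.
    if has_content and "xmlns=" not in block["content"]:
        return "inline content lacks an xmlns attribute"
    return None
```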

Children: None (void element)

4.19 Barcode

Semantic barcode representation including QR codes, DataMatrix, and linear barcodes.

{
  "type": "barcode",
  "format": "qr",
  "data": "https://example.com/verify/abc123",
  "alt": "Verification QR code linking to certificate validation page",
  "errorCorrection": "M",
  "size": 100
}

Attribute	Type	Required	Description
`format`	string	Yes	Barcode format
`data`	string	Yes	Encoded content
`alt`	string	Yes	Alternative text description
`errorCorrection`	string	No	Error correction level (QR/DataMatrix)
`size`	integer	No	Rendered size in pixels
`quietZone`	integer	No	Quiet zone size in modules

4.19.1 Barcode Formats

Format	Description
`qr`	QR Code (2D matrix)
`datamatrix`	Data Matrix (2D matrix)
`code128`	Code 128 (linear, alphanumeric)
`code39`	Code 39 (linear, alphanumeric)
`ean13`	EAN-13 (linear, numeric)
`ean8`	EAN-8 (linear, numeric)
`upca`	UPC-A (linear, numeric)
`pdf417`	PDF417 (2D stacked linear)

4.19.2 Error Correction Levels

For QR codes and DataMatrix:

Level	Recovery Capacity
`L`	~7%
`M`	~15% (default)
`Q`	~25%
`H`	~30%

The data field contains the raw content to encode. For URLs, use the full URL. For structured data, implementations SHOULD encode appropriately for the format.

The alt field MUST provide a meaningful description of what the barcode represents, not just "QR code" — include the purpose and destination.

Children: None (void element)

4.20 Figure

Container for figures with optional captions. Figures group visual content (images, SVGs, tables, charts) with their captions for semantic association and automatic numbering.

{
  "type": "figure",
  "id": "fig-mass-arc",
  "children": [
    {
      "type": "image",
      "src": "assets/images/chart.png",
      "alt": "Line chart showing mass per arc length"
    },
    {
      "type": "figcaption",
      "children": [
        { "type": "text", "value": "Figure 1: Mass / Arc Length Distribution" }
      ]
    }
  ],
  "numbering": "auto"
}

Attribute	Type	Required	Description
`numbering`	string or integer	No	Figure numbering mode
`subfigures`	array	No	Array of subfigure objects (alternative to children)

4.20.1 Numbering Values

Value	Description
`auto`	Automatically number in document order
`none`	No numbering
(integer)	Explicit figure number
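
One possible reading of these numbering values is sketched below (non-normative). Several points are assumptions of the sketch, not statements of this specification: a figure without a numbering attribute is treated like "none", explicit integers do not advance the automatic counter, and only top-level blocks are scanned. The name `assign_figure_numbers` is hypothetical.

```python
from typing import Optional


def assign_figure_numbers(blocks: list[dict]) -> dict[int, Optional[int]]:
    """Map the index of each figure block to its number (None if unnumbered)."""
    numbers: dict[int, Optional[int]] = {}
    counter = 0
    for index, block in enumerate(blocks):
        if block.get("type") != "figure":
            continue
        mode = block.get("numbering", "none")  # assumption: absent means "none"
        if mode == "auto":
            counter += 1
            numbers[index] = counter
        elif isinstance(mode, int) and not isinstance(mode, bool):
            numbers[index] = mode  # explicit figure number, used as given
        else:
            numbers[index] = None
    return numbers
```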

Children: The figure block MAY contain either:

Standard figure content: exactly one content block (image, svg, table, math, or barcode) and zero or one figcaption block
A subfigures array (for multi-panel figures)

The figcaption block MAY appear before or after the content block.
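
The child rules for the standard form can be checked as in this non-normative sketch. `figure_errors` and `FIGURE_CONTENT_TYPES` are illustrative names, and figures that use a subfigures array are passed through unchecked.

```python
FIGURE_CONTENT_TYPES = {"image", "svg", "table", "math", "barcode"}


def figure_errors(figure: dict) -> list[str]:
    """List violations of the standard-form figure child rules."""
    if "subfigures" in figure:
        return []  # multi-panel form; not checked in this sketch
    types = [c.get("type") for c in figure.get("children", [])]
    content = sum(t in FIGURE_CONTENT_TYPES for t in types)
    captions = types.count("figcaption")
    errors = []
    if content != 1:
        errors.append(f"expected exactly one content block, found {content}")
    if captions > 1:
        errors.append(f"expected at most one figcaption, found {captions}")
    for t in types:
        if t not in FIGURE_CONTENT_TYPES and t != "figcaption":
            errors.append(f"unexpected child type: {t}")
    return errors
```

Because the check counts children rather than inspecting their order, a figcaption placed before the content block is accepted, as permitted above.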

4.20.2 Subfigures

For multi-panel figures with individual sub-images and captions, use the subfigures array:

{
  "type": "figure",
  "id": "fig-comparison",
  "subfigures": [
    {
      "id": "fig-comparison-a",
      "label": "a",
      "children": [
        {
          "type": "image",
          "src": "assets/images/before.png",
          "alt": "Tissue sample before treatment"
        }
      ],
      "caption": "Before treatment"
    },
    {
      "id": "fig-comparison-b",
      "label": "b",
      "children": [
        {
          "type": "image",
          "src": "assets/images/after.png",
          "alt": "Tissue sample after treatment"
        }
      ],
      "caption": "After treatment"
    }
  ],
  "caption": [
    { "type": "text", "value": "Comparison of tissue samples showing treatment effects" }
  ],
  "numbering": "auto"
}

Subfigure Field	Type	Required	Description
`id`	string	No	Unique subfigure identifier for cross-referencing
`label`	string	No	Subfigure label (e.g., "a", "b", "i", "ii")
`children`	array	Yes	Content block(s) for this subfigure
`caption`	string or array	No	Subfigure-specific caption

When using subfigures, the top-level caption field (instead of a figcaption child) provides the overall figure caption. This can be a simple string or an array of text nodes for rich formatting.

Rendering: Subfigure labels are typically rendered in parentheses (e.g., "(a)", "(b)") adjacent to or below each sub-image. The overall caption references subfigures by their labels (e.g., "Figure 1: (a) Before treatment. (b) After treatment.").

4.21 Figure Caption

Caption for a figure. Only valid as a child of figure.

{
  "type": "figcaption",
  "children": [
    { "type": "text", "value": "Figure 1: System Architecture Overview" }
  ]
}

Children: Text nodes or inline content (including links, marks)

4.22 Admonition

Callout boxes for notes, warnings, tips, and other highlighted content commonly used in technical documentation, textbooks, and tutorials.

{
  "type": "admonition",
  "variant": "warning",
  "title": "Caution",
  "children": [
    {
      "type": "paragraph",
      "children": [
        { "type": "text", "value": "High voltage. Disconnect power before servicing." }
      ]
    }
  ]
}

Attribute	Type	Required	Description
`variant`	string	Yes	Admonition type (see below)
`title`	string	No	Custom title (defaults to variant label)

4.22.1 Admonition Variants

Variant	Typical Use
`note`	General information or supplementary content
`tip`	Helpful suggestions or best practices
`info`	Additional context or background information
`warning`	Potential issues or important considerations
`caution`	Actions that may cause problems
`danger`	Critical warnings about harmful actions
`important`	Key information that should not be missed
`example`	Illustrative examples or demonstrations

Children: Block-level content (typically paragraph blocks)

The title field is OPTIONAL. When omitted, implementations SHOULD use a localized label based on the variant (e.g., "Note", "Warning", "Tip").
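
A renderer might resolve the displayed title as in the non-normative sketch below. The English labels in `DEFAULT_LABELS` are placeholders for whatever localized strings an implementation supplies, and `admonition_title` is an illustrative name.

```python
# Assumed English labels; a real implementation would look these up per locale.
DEFAULT_LABELS = {
    "note": "Note",
    "tip": "Tip",
    "info": "Info",
    "warning": "Warning",
    "caution": "Caution",
    "danger": "Danger",
    "important": "Important",
    "example": "Example",
}


def admonition_title(block: dict) -> str:
    """Return the custom title if present, otherwise the variant label."""
    return block.get("title") or DEFAULT_LABELS[block["variant"]]
```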

5. Extension Block Types

Extensions MAY define additional block types. Extension blocks:

MUST use a namespaced type (e.g., "forms:input")
SHOULD define fallback rendering for non-supporting implementations
MUST be documented in the extension specification

Example:

{
  "type": "forms:textInput",
  "name": "email",
  "label": "Email Address",
  "required": true,
  "fallback": {
    "type": "paragraph",
    "children": [{ "type": "text", "value": "[Email input field]" }]
  }
}
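
An implementation that does not support a given extension might substitute the fallback as sketched below (non-normative). The sketch assumes the fallback is carried in a `fallback` field, as in the example above, and that the namespace is the part of the type before the first colon; `resolve_block` is a hypothetical helper name.

```python
from typing import Optional


def resolve_block(block: dict, supported_namespaces: set[str]) -> Optional[dict]:
    """Return the block to render, or None if nothing can be rendered."""
    block_type = block["type"]
    if ":" not in block_type:
        return block  # core block types carry no namespace
    namespace = block_type.split(":", 1)[0]
    if namespace in supported_namespaces:
        return block
    # Unsupported extension: use the fallback block if one was provided.
    return block.get("fallback")
```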

5.1 Extension Mark Types

Extensions MAY define additional mark types for use within text nodes. Unlike block types, extension marks are NOT required to use a namespace prefix — marks operate within the text node's marks array where the collision risk is lower. However, extensions MAY use a namespace prefix (e.g., legal:cite) when the unqualified name could cause confusion with marks from other extensions.

Extension marks:

SHOULD be documented in the extension specification
MUST define their field table (type, required fields, optional fields)
SHOULD avoid names that collide with core marks or marks from commonly-paired extensions

6. Internationalization

6.1 Text Direction and Writing Mode

Blocks MAY specify text direction and writing mode:

{
  "type": "paragraph",
  "attributes": {
    "dir": "rtl",
    "lang": "ar"
  },
  "children": [...]
}

{
  "type": "paragraph",
  "attributes": {
    "lang": "ja",
    "writingMode": "vertical-rl"
  },
  "children": [...]
}

Attribute	Type	Description
`dir`	string	Text direction: "ltr", "rtl", "auto"
`lang`	string	BCP 47 language tag
`writingMode`	string	Text flow direction (see below)

6.1.1 Writing Mode Values

Value	Description	Use Case
`horizontal-tb`	Horizontal text, top-to-bottom block flow (default)	Latin, Cyrillic, Arabic, Hebrew
`vertical-rl`	Vertical text, right-to-left column flow	Traditional Chinese, Japanese, Korean
`vertical-lr`	Vertical text, left-to-right column flow	Mongolian
`sideways-rl`	Text rotated 90° clockwise	Rotated labels
`sideways-lr`	Text rotated 90° counter-clockwise	Rotated labels

The writingMode attribute indicates the semantic writing mode of the content. For CJK text that was authored in vertical mode, this attribute preserves authorial intent even when presentation may vary.

Relationship to Presentation: The presentation layer (see Presentation Layers spec, Section 5.1.1) may override the visual rendering of writing mode. The content-level writingMode attribute represents the source/semantic writing direction, while the presentation-level writingMode style controls actual rendering.

6.2 Unicode

All text content MUST be valid Unicode (UTF-8 encoded in JSON). Implementations MUST support:

All Unicode normalization forms
Bidirectional text
Combining characters
Emoji

7. Validation

7.1 Structural Rules

Blocks MUST have a valid type
Children MUST be appropriate for the block type
Required attributes MUST be present
Text nodes MUST have non-null value

7.2 Content Rules

Paragraph children MUST be text nodes
Heading children MUST be text nodes
List children MUST be listItem blocks
Table children MUST be tableRow blocks
TableRow children MUST be tableCell blocks
TableCell children MUST be block-level content (typically paragraph blocks)
DefinitionList children MUST be definitionItem blocks
DefinitionItem children MUST include at least one definitionTerm and one definitionDescription
Figure children MUST include exactly one content block and zero or one figcaption (or use subfigures array)
Figcaption is only valid as a child of figure or subfigure
Admonition children MUST be block-level content (typically paragraph blocks)

7.3 Cross-References

Image src attributes that reference embedded assets MUST correspond to files in the assets directory.
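
A validator could check this rule against the archive's file listing, as in the non-normative sketch below. The name `missing_image_assets` is illustrative; the sketch assumes any src containing "://" is an external URL and only follows `children` arrays.

```python
def missing_image_assets(blocks: list[dict], archive_paths: set[str]) -> list[str]:
    """Return embedded image paths that are absent from the archive."""
    missing: list[str] = []

    def visit(node: dict) -> None:
        if node.get("type") == "image":
            src = node.get("src", "")
            # Assumption: anything with a scheme separator is an external URL.
            if "://" not in src and src not in archive_paths:
                missing.append(src)
        for child in node.get("children", []):
            visit(child)

    for block in blocks:
        visit(block)
    return missing
```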

8. Processing Model

8.1 Parsing

Parse JSON
Validate root structure
Recursively validate blocks
Resolve asset references
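
The first three parsing steps might look like the following non-normative Python sketch. It checks only a few of the rules in this specification (no BOM, root fields, a string type on every block, a string value on every text node) and leaves asset resolution out; `parse_content` and `validate_block` are illustrative names.

```python
import json


def validate_block(block: object) -> None:
    """Recursively check the minimal structural rules for one node."""
    if not isinstance(block, dict) or not isinstance(block.get("type"), str):
        raise ValueError("every block must be an object with a string type")
    if block["type"] == "text" and not isinstance(block.get("value"), str):
        raise ValueError("text nodes must have a string value")
    for child in block.get("children", []):
        validate_block(child)


def parse_content(raw: bytes) -> dict:
    """Parse and minimally validate the bytes of a content file."""
    if raw.startswith(b"\xef\xbb\xbf"):
        raise ValueError("content file must not start with a UTF-8 BOM")
    document = json.loads(raw.decode("utf-8"))  # step 1: parse JSON
    if not isinstance(document, dict):
        raise ValueError("root must be a JSON object")
    # Step 2: validate the root structure.
    if not isinstance(document.get("version"), str):
        raise ValueError("version must be a string")
    if not isinstance(document.get("blocks"), list):
        raise ValueError("blocks must be an array")
    for block in document["blocks"]:  # step 3: recursively validate blocks
        validate_block(block)
    return document
```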

8.2 Serialization

Construct block tree
Assign IDs if needed
Serialize to JSON
Compute hash for manifest

8.3 Normalization

Before comparison or hashing, content SHOULD be normalized:

Sort object keys alphabetically
Remove empty attributes
Collapse adjacent text nodes with same marks
Normalize Unicode to NFC form
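
The four normalization steps can be combined in one recursive pass, as in this non-normative sketch. It reads "empty attributes" as an empty `attributes` object, treats a missing `marks` field and an empty `marks` array as equal when collapsing text nodes, and uses compact separators for the serialized form; all three are assumptions of the sketch. The function names are illustrative.

```python
import json
import unicodedata


def _is_text(node: object) -> bool:
    return isinstance(node, dict) and node.get("type") == "text"


def _merge_text_nodes(items: list) -> list:
    """Collapse adjacent text nodes whose marks are equal."""
    merged: list = []
    for item in items:
        prev = merged[-1] if merged else None
        if _is_text(item) and _is_text(prev) and item.get("marks", []) == prev.get("marks", []):
            # Re-normalize the joined string: a combining character at the
            # start of the second value may compose with the first.
            value = unicodedata.normalize("NFC", prev["value"] + item["value"])
            merged[-1] = {**prev, "value": value}
        else:
            merged.append(item)
    return merged


def normalize(node):
    """Return a normalized copy of a document, block, or value."""
    if isinstance(node, str):
        return unicodedata.normalize("NFC", node)
    if isinstance(node, list):
        return _merge_text_nodes([normalize(item) for item in node])
    if isinstance(node, dict):
        out = {key: normalize(value) for key, value in node.items()}
        if out.get("attributes") == {}:
            del out["attributes"]
        return out
    return node


def canonical_json(document: dict) -> str:
    """Serialize the normalized document with alphabetically sorted keys."""
    return json.dumps(normalize(document), sort_keys=True,
                      ensure_ascii=False, separators=(",", ":"))
```

The string returned by `canonical_json` is what a comparison or hash would then operate on; the choice of hash algorithm is outside the scope of this sketch.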

9. Examples

9.1 Simple Document

{
  "version": "0.1",
  "blocks": [
    {
      "type": "heading",
      "level": 1,
      "children": [{ "type": "text", "value": "Introduction" }]
    },
    {
      "type": "paragraph",
      "children": [
        { "type": "text", "value": "This document describes the " },
        { "type": "text", "value": "Codex", "marks": ["bold"] },
        { "type": "text", "value": " format." }
      ]
    },
    {
      "type": "list",
      "ordered": false,
      "children": [
        {
          "type": "listItem",
          "children": [
            {
              "type": "paragraph",
              "children": [{ "type": "text", "value": "Semantic-first design" }]
            }
          ]
        },
        {
          "type": "listItem",
          "children": [
            {
              "type": "paragraph",
              "children": [{ "type": "text", "value": "Modern security" }]
            }
          ]
        }
      ]
    }
  ]
}

9.2 Document with Table and Image

{
  "version": "0.1",
  "blocks": [
    {
      "type": "heading",
      "level": 1,
      "children": [{ "type": "text", "value": "Results" }]
    },
    {
      "type": "image",
      "src": "assets/images/chart.png",
      "alt": "Bar chart showing quarterly results",
      "title": "Figure 1: Quarterly Results"
    },
    {
      "type": "table",
      "children": [
        {
          "type": "tableRow",
          "header": true,
          "children": [
            { "type": "tableCell", "children": [
              { "type": "paragraph", "children": [{ "type": "text", "value": "Quarter" }] }
            ]},
            { "type": "tableCell", "children": [
              { "type": "paragraph", "children": [{ "type": "text", "value": "Revenue" }] }
            ]}
          ]
        },
        {
          "type": "tableRow",
          "children": [
            { "type": "tableCell", "children": [
              { "type": "paragraph", "children": [{ "type": "text", "value": "Q1" }] }
            ]},
            { "type": "tableCell", "children": [
              { "type": "paragraph", "children": [{ "type": "text", "value": "$1.2M" }] }
            ]}
          ]
        }
      ]
    }
  ]
}
