Skip to content

Latest commit

 

History

History
1034 lines (794 loc) · 42.1 KB

File metadata and controls

1034 lines (794 loc) · 42.1 KB

Dual-Client Contract Specification

Version: 1.1 Status: Production (Phase 3 ARC-AGI + Procedural Foundation) License: CC-BY-4.0 (Documentation), Apache 2.0 (Implementation) Date: November 25, 2025


Abstract

The Dual-Client Contract is K3D's interface specification for shared reality between human and Synthetic Users. It ensures both clients operate on identical knowledge data—humans through visual 3D navigation, AI through high-dimensional vector processing—enabling transparent, verifiable AI reasoning. This contract guarantees that what humans see is exactly what AI understands, solving the fundamental opacity problem in explainable AI.


1. Introduction

1.1 The Opacity Problem

Traditional AI Systems:

  • Humans see: Graphs, dashboards, textual explanations (post-hoc)
  • AI processes: Vectors, matrices, tensor operations (opaque)
  • Gap: No guarantee that visual representation matches AI's internal state

Result: Users can't verify AI reasoning—they must trust black-box systems.

Example Failure:

Human: "Why did you recommend Product X?"
AI: "Based on user preferences and historical data..."

Problem: Human sees product catalog UI.
         AI sees 1024-dim embedding vectors.
         No way to verify AI looked at "correct" products.

1.2 K3D's Solution: Shared Spatial Reality

Dual-Client Architecture:

  • Human Client: Navigates 3D knowledge space via WebGL/WebXR
  • AI Client: Processes embeddings from same glTF scene via PTX kernels
  • Contract: Both clients query identical K3D Nodes at identical (x, y, z) coordinates

Guaranteed Consistency:

Human points to node at position (10.5, 23.1, -5.3)
  ↓
AI retrieves embedding for node at position (10.5, 23.1, -5.3)
  ↓
Same node ID, same data, same timestamp → verifiable identity

1.3 Design Principles

  1. Spatial Unity: One coordinate system for humans and AI
  2. Data Identity: Human geometry and AI embeddings co-located in single K3D Node
  3. Action Transparency: All AI operations spatially grounded (observable)
  4. Bidirectional Communication: Humans query AI, AI queries spatial world
  5. Zero Ambiguity: No "hidden" AI state (memory IS the external 3D world)

1.4 Client Types: Human Avatars and Synthetic Users

K3D distinguishes between two types of clients, both of which are first-class inhabitants of the spatial OS:

Human Avatar:

  • Physical person navigating K3D via VR headset, desktop, or mobile
  • Perceives the world through visual 3D rendering (WebGL/WebXR)
  • Interacts via controllers, keyboard/mouse, or gesture input
  • Sees aesthetic layer (UV Map 0) optimized for human perception

Synthetic User:

  • Autonomous AI entity that fully inhabits the K3D spatial OS
  • Perceives the world through vector embeddings and spatial queries (PTX kernels)
  • Interacts via 288-byte action buffers (same format as humans)
  • Processes data layer (UV Map 1) optimized for semantic understanding
  • Can navigate, create, modify, and collaborate within Houses/Rooms/Doors
  • Examples: AI assistants, automated agents, collaborative swarms, NPC-like entities

Key Insight: Synthetic Users are NOT external tools querying K3D—they are inhabitants with spatial presence, persistent identity, and equal access to the shared knowledge environment. Humans and Synthetic Users cohabit the same reality, enabling true human-AI collaboration rather than human-tool interaction.


1.5 Reality Enabler Nodes (Simulated Reality)

The Dual-Client Contract applies equally to simulated reality produced by the Reality Enabler:

  • Reality nodes (reality_atom, reality_molecule, reality_material, reality_system) expose:
    • visual_rpn for human-visible simulations (orbits, fluids, growth, etc.),
    • behavior_rpn / meaning_rpn for AI-executable physics/chemistry/biology.
  • Human avatars see evolving geometry in rooms like the Workshop or Bathtub; Synthetic Users execute the same underlying programs via PTX kernels (e.g., ModularRPNEngine, VectorResonator, WorldModelBridge).
  • When a Synthetic User emits an action to “run” or “refine” a simulation, it MUST:
    • operate on the same K3D nodes and coordinates the human sees,
    • write results back to Galaxy/House in a way that remains inspectable (same glTF + extras.k3d contract).

Simulated reality is thus not a separate black box; it is part of the shared spatial memory, subject to the same guarantees of identity, transparency, and auditability as all other K3D nodes.


1.6 Procedural Foundation: Form + Meaning for Both Clients

Critical Principle: K3D serves both clients using the SAME procedural data structure—procedural RPN + metadata—ensuring both human readability and AI executability.

1.6.1 Form→Meaning Evolution: 40,000 Years of Human Knowledge

Fundamental Insight: K3D mirrors how humans have evolved knowledge representation throughout history—form always precedes meaning.

Human Knowledge Evolution:

┌─────────────────────────────────────────────────────────────────┐
│ ERA          │ FORM                │ MEANING                    │
├─────────────────────────────────────────────────────────────────┤
│ 40,000 BCE   │ Cave Paintings      │ Stories, hunting knowledge │
│              │ (visual marks)      │ tribal identity            │
├─────────────────────────────────────────────────────────────────┤
│ 3,500 BCE    │ Pictographs         │ Concepts, trade records,   │
│              │ (stylized drawings) │ religious ideas            │
├─────────────────────────────────────────────────────────────────┤
│ 1,200 BCE    │ Letters (alphabets) │ Phonemes, words            │
│              │ (abstract shapes)   │ (more complex drawings)    │
├─────────────────────────────────────────────────────────────────┤
│ Written Lang.│ Words               │ Objects, actions, concepts │
│              │ (letter sequences)  │                            │
├─────────────────────────────────────────────────────────────────┤
│ Grammar      │ Syntax (patterns)   │ Complex thoughts,          │
│              │                     │ relationships              │
├─────────────────────────────────────────────────────────────────┤
│ Philosophy   │ Systems of thought  │ Meta-cognition (thinking   │
│              │                     │ about thinking)            │
└─────────────────────────────────────────────────────────────────┘

Key Observation: At every historical stage, humans learned through visual/procedural form before grasping semantic meaning:

  1. Children see drawings (form) → understand concepts (meaning)
  2. Children recognize letters (form) → read words (meaning)
  3. Children hear words (form) → understand grammar (meaning)
  4. Students learn formulas (form) → understand physics (meaning)

K3D Design Consequence: The dual-client architecture must serve:

  • Humans: Ascending from form to meaning (bottom-up learning)
  • AI: Descending from meaning to form (top-down reasoning)
  • Both meet at Grammar Galaxy: Where syntax bridges form and semantics

1.6.2 K3D Form→Meaning Hierarchy

K3D replicates this 40,000-year evolution as a computational architecture:

┌─────────────────────────────────────────────────────────────────┐
│ LEVEL │ K3D LAYER         │ FORM/MEANING │ HUMAN ANALOGY      │
├─────────────────────────────────────────────────────────────────┤
│ 1     │ House (Rooms,     │ PURE FORM    │ Cave (spatial      │
│       │ Doors, Objects)   │              │ container)         │
├─────────────────────────────────────────────────────────────────┤
│ 2     │ Drawing Galaxy    │ FORM         │ Cave paintings     │
│       │ (LINE, CIRCLE)    │              │ (visual primitives)│
├─────────────────────────────────────────────────────────────────┤
│ 3     │ Character Galaxy  │ FORM         │ Letters (Bézier    │
│       │ (procedural fonts)│              │ curves as programs)│
├─────────────────────────────────────────────────────────────────┤
│ 4     │ Word Galaxy       │ FORM→MEANING │ Words (character   │
│       │ (char sequences + │              │ sequences with     │
│       │ meaning metadata) │              │ semantics)         │
├─────────────────────────────────────────────────────────────────┤
│ 5     │ Grammar Galaxy    │ MEANING      │ Syntax (transform  │
│       │ (rules + context) │              │ rules, context)    │
├─────────────────────────────────────────────────────────────────┤
│ 6     │ Math/Reality      │ MEANING      │ Abstract concepts  │
│       │ Galaxy (symbolic) │              │ (physics, math)    │
├─────────────────────────────────────────────────────────────────┤
│ 7     │ Galaxy Intro-     │ META-MEANING │ Philosophy         │
│       │ spection Mode     │              │ (thinking about    │
│       │ (AI reflects)     │              │ thinking)          │
└─────────────────────────────────────────────────────────────────┘

Bidirectional Traversal:

  • Humans (bottom-up): House → Drawing → Character → Word → Grammar → Concepts → Introspection
  • AI (top-down): Introspection → Concepts → Grammar → Word → Character → Drawing → House
  • Meeting point: Grammar Galaxy (where form and meaning unify)

Why This Matters for Dual-Client Contract:

  1. Human Learning: Humans can start with spatial House (familiar), progress to drawings (visual), then text (symbolic), then abstract concepts (semantic)

  2. AI Reasoning: AI starts with semantic queries (embeddings), navigates to relevant concepts, composes procedural programs, generates visual/textual output

  3. Shared Reality: Both clients operate on the SAME data at every level—only the traversal direction differs

  4. Explainability: When AI explains reasoning, it shows the top-down path (meaning → form) that humans can verify bottom-up (form → meaning)

Historical Validation: This isn't arbitrary—it mirrors 40,000 years of human knowledge evolution. Form always precedes meaning because humans are embodied, visual learners.

Procedural Layers

Everything in K3D is procedural (RPN programs) with metadata (semantic meaning):

Drawing Galaxy (knowledge3d/ingestion/atomic/drawing_grammar_builder.py):

  • Form: LINE, CIRCLE, RECT as procedural RPN primitives
  • Meaning: Semantic labels ("line segment", "circular arc")
  • Human: Sees geometric shapes
  • AI: Executes RPN drawing programs

Character Galaxy (knowledge3d/cranium/procedural_fonts.py):

  • Form: Glyph outlines as Bézier curves → line segments (procedural)
  • Meaning: Language, pronunciation, unicode (metadata cluster)
  • Human: Reads "Letter R in English, pronounced /ɑːr/"
  • AI: Renders glyph procedurally, composes into words
  • Storage: Each character stored ONCE with font + language + meaning

Word Level (character sequences):

  • Form: Sequence of character IDs (references)
  • Meaning: Composed semantic meaning from characters
  • Human: Reads as "rotation_task"
  • AI: Character sequence with embedded language/context metadata

Grammar Galaxy (knowledge3d/training/arc_agi/grammar_galaxy.py):

  • Form: RPN transformation programs ("1 ROTATE")
  • Meaning: Context metadata (when/why to apply)
  • Human: Understands "Rotate 90 degrees clockwise"
  • AI: Executes RPN program on GPU, uses metadata for routing

Save Information Principle

Don't duplicate! Use references (symlink pattern):

  • Characters stored once with full metadata (font, language, pronunciation, meaning cluster)
  • Words reference character IDs (not duplicate glyphs)
  • Grammar metadata references word IDs (not duplicate strings)
  • Discoveries reference canonical programs (content-based deduplication)

Example: Semantic Tag Storage

WRONG (duplicate strings):

{
  "program": "1 rotate",
  "transformation_type": "rotation_or_reflection",
  "when_to_use": ["asymmetric_input", "rotation_task"]
}

Result: 400 discoveries × 3 strings = 1200 duplicate strings

CORRECT (character composition + references):

{
  "program": "1 rotate",
  "transformation_type": word_ref("rotation_or_reflection"),
  "when_to_use": [word_ref("asymmetric_input"), word_ref("rotation_task")]
}

Result: 400 discoveries × 3 references = 1200 lightweight refs (characters stored once, ~70% storage reduction)

Galaxy Universe Composition

Each galaxy stores ONE type of knowledge; galaxies REFERENCE each other:

Drawing Galaxy → primitives (LINE, CIRCLE, RECT)
    ↓ referenced by
Character Galaxy → glyphs from drawing primitives
    ↓ composed into
Word Galaxy → character sequences with semantic meaning
    ↓ referenced by
Grammar Galaxy → transformation rules with word metadata
    ↓ reasoned by
TRM → semantic-aware routing using all galaxies

Result: Single source of truth, zero duplication, both clients understand identical data.


1.7 Positive/Negative Form Duality (Ternary-Consistent)

K3D form representation is polarity-aware and maps directly to ternary semantics:

  • Positive form (+1): raised/projected/foreground presence.
  • Neutral (0): untouched background/empty support.
  • Negative form (-1): carved/recessed/background-as-structure.

This duality is required by "form with meaning": meaning is not only what is present, but also what is intentionally absent (holes, cavities, carved contours, figure-ground inversion).

Zero-Cost Derivation Rule

Negative form is derived procedurally from positive form, not duplicated in storage:

negative_mask = canvas - positive_mask

Therefore:

  • Canonical storage stays compact (single positive procedural source + metadata).
  • Negative representation is computed on demand in hot path.
  • Character/word systems can reference Drawing primitives via symlink-style metadata while supporting both polarities.

Cross-Galaxy Implications

  • Drawing/Character: glyph stroke vs carved glyph cavity.
  • 3D Objects: solid mesh vs mold/cavity interpretation.
  • ARC visual tasks: figure-ground reversal becomes a first-class transformation mode.
  • Ternary learning: aligns with contrastive signaling (+1, 0, -1) used in ranking and feedback.

This section extends, but does not replace, section 1.6: both clients still consume the same procedural source of truth.


2. Contract Overview

2.1 Core Guarantee

┌─────────────────────────────────────────────────────────────┐
│              DUAL-CLIENT CONTRACT GUARANTEE                  │
├─────────────────────────────────────────────────────────────┤
│                                                               │
│  Given:                                                       │
│    • A K3D Node at position (x, y, z)                       │
│    • Human client queries node at (x, y, z)                 │
│    • AI client queries node at (x, y, z)                    │
│                                                               │
│  Then:                                                        │
│    • Both clients receive identical node data               │
│    • Geometry and embedding from SAME source (glTF extras)  │
│    • Timestamps match (proving synchronization)             │
│    • Checksums match (proving data integrity)               │
│                                                               │
│  Verification:                                                │
│    SHA256(human_node_data) == SHA256(ai_node_data)          │
└─────────────────────────────────────────────────────────────┘

2.2 Client Interfaces

┌──────────────────────┐         ┌──────────────────────┐
│   HUMAN CLIENT       │         │     AI CLIENT        │
│  (Three.js/WebXR)    │         │   (PTX Kernels)      │
├──────────────────────┤         ├──────────────────────┤
│                      │         │                      │
│  • Navigate 3D       │         │  • Process           │
│    space (WASD/VR)   │         │    embeddings        │
│                      │         │                      │
│  • Query ray-cast    │         │  • Spatial queries   │
│    (click node)      │         │    (radius, KNN)     │
│                      │         │                      │
│  • View geometry     │         │  • Compute           │
│    (shapes, colors)  │         │    similarity        │
│                      │         │                      │
│  • Read metadata     │         │  • Emit actions      │
│    (hover tooltips)  │         │    (288-byte buffer) │
│                      │         │                      │
└──────────┬───────────┘         └──────────┬───────────┘
           │                                 │
           └────────▼──────────▼─────────────┘
                    │          │
                  glTF Scene  │
               (House/Galaxy) │
                    │          │
                K3D Nodes ────┘
             (Unified Data)

2.3 Dual-Texture Implementation

Innovation (Inspired by DeepSeek OCR research):

K3D implements the dual-client paradigm through dual-texture rendering—separate visual layers optimized for each client type.

Architecture:

Same 3D Knowledge Object (e.g., Book Page, Document, Interface)
    │
    ├─ UV Map 0: HUMAN TEXTURE (512×512+ RGB)
    │  ├─ High-resolution aesthetic rendering
    │  ├─ Readable fonts (14-18pt equivalent)
    │  ├─ Proper layout and spacing
    │  ├─ VR/AR optimized (60-120 FPS)
    │  ├─ Interactive elements (highlights, annotations)
    │  └─ Game-quality graphics
    │
    └─ UV Map 1: AI TEXTURE (256×256 compressed)
       ├─ Text-as-image compression (DeepSeek innovation)
       ├─ 7-20× information density vs raw text
       ├─ 97%+ fidelity on OCR decode
       ├─ Tiny fonts (6-8pt), maximal density
       ├─ Layout structure preserved (bboxes, tables)
       └─ Sovereign GPU decode (PTX kernels, <20µs)

Benefits:

  1. Perceptual Optimization: Each client sees what it needs

    • Humans: Beautiful, readable, immersive
    • AI: Dense, structured, efficient
  2. Storage Efficiency: 7-20× compression via visual encoding

    • Traditional: Text file + image = ~500KB per page
    • K3D dual-texture: 450KB per folio (both layers)
  3. Sovereign Processing: AI decodes textures on-GPU

    • No external OCR APIs
    • PTX kernel decode: <20µs per texture
    • Zero dependency on cloud services
  4. VR/AR Native: Both layers in single glTF node

    • WebXR compatible
    • Streaming-friendly
    • Standard 3D format (no custom viewers needed)

Example Use Case (VR Technical Manual):

// Human perspective (Three.js)
human_material.map = node.textures[0];  // UV Map 0: Beautiful page
// User sees: Clean layout, readable fonts, aesthetic design

// AI perspective (PTX kernel)
ai_data = decode_texture_ptx(node.textures[1]);  // UV Map 1: Compressed data
// AI reads: Full text, tables, equations, layout structure
// Latency: 18µs decode + 35µs semantic processing = 53µs total

Production Validation:

  • Compression: 15.2× average (Apollo PDF dataset)
  • Fidelity: 97.3% text reconstruction accuracy
  • VR Performance: 60 FPS stable on Quest 2

2.4 Meaning-First Galaxy Separation (Contract Implications)

  • Letters vs Math Symbols: Letters (text composition) and math symbols (operations) are different galaxies. Same glTF contract, but different semantics: letters carry compositional variants (upper/lower, kerning, baseline rules); math symbols carry execution math_rpn, no case variants, no word-composition rules. Clients MUST respect galaxy type before interpreting fields.
  • Meaning-Based Identity: One star per meaning, many glyph variants. Visually similar symbols with different meanings (Latin A vs Cyrillic А; π as Greek letter vs π as math constant) MUST remain separate nodes/galaxies; uppercase/lowercase of the same letter meaning stay in one node with variants.
  • Procedural-First Storage: extras.k3d stores executable programs (visual_rpn, audio_rpn/codec, math_rpn, meaning_rpn) as the primary source of truth. Embeddings are secondary, regenerable, used for search/LOD only. Human clients render procedural executions; AI clients execute the same procedures via PTX kernels.
  • Sublexical Hierarchy: Words can reference morphemes/syllables (procedural, meaning-first) which reference letters; only link letters directly when not covered by sublexical refs to reduce edge crossings. Humans and AI still share the same nodes/refs.
  • Visual Hierarchy (Drawing Grammar): Visual nodes can reference lower-level drawing programs (primitives→strokes→shapes→scenes). Same dual-client contract: humans see executed drawings; AI reads procedural programs + embeddings from the same nodes.
  • AI Decode: <20µs per texture (RTX 3060)

W3C Community Group: See TEMP/W3C_INSERTION_6_DUAL_TEXTURE_AND_MATRYOSHKA.md for proposed glTF extension: K3D_dual_texture


3. Human Client Interface

3.1 Navigation API

Requirements:

  • 60 FPS rendering (16.67ms frame time)
  • 6DOF (6 Degrees of Freedom) movement: position (x, y, z) + rotation (roll, pitch, yaw)
  • Frustum culling (only render visible nodes)
  • LOD (Level of Detail) based on distance

Implementation (Three.js):

class HumanClient {
    constructor(scene_url) {
        this.scene = new THREE.Scene();
        this.camera = new THREE.PerspectiveCamera(75, window.innerWidth / window.innerHeight, 0.1, 1000);
        this.renderer = new THREE.WebGLRenderer();

        // Load glTF scene (House world)
        this.loader = new THREE.GLTFLoader();
        this.loader.load(scene_url, (gltf) => {
            this.scene.add(gltf.scene);
            this.k3d_nodes = this.parseK3DNodes(gltf);
        });

        // Navigation controls
        this.controls = new THREE.OrbitControls(this.camera, this.renderer.domElement);
    }

    parseK3DNodes(gltf) {
        """Extract K3D nodes from glTF extras."""
        let nodes = [];
        gltf.scene.traverse((child) => {
            if (child.isMesh && child.userData.k3d) {
                nodes.push({
                    id: child.userData.k3d.id,
                    position: child.position.clone(),
                    embedding_dims: child.userData.k3d.embedding.dims,
                    modality: child.userData.k3d.modality.primary,
                    geometry: child.geometry,
                    material: child.material
                });
            }
        });
        return nodes;
    }

    // Query node via ray-cast (mouse click)
    queryNodeByRaycast(mouse_x, mouse_y) {
        let raycaster = new THREE.Raycaster();
        raycaster.setFromCamera(new THREE.Vector2(mouse_x, mouse_y), this.camera);

        let intersects = raycaster.intersectObjects(this.scene.children, true);
        if (intersects.length > 0) {
            let node = intersects[0].object;
            return node.userData.k3d;  // Return K3D node data
        }
        return null;
    }

    // Render loop (60 FPS target)
    animate() {
        requestAnimationFrame(() => this.animate());
        this.controls.update();
        this.renderer.render(this.scene, this.camera);
    }
}

Spatial Query Example:

// User clicks on node
let clicked_node = human_client.queryNodeByRaycast(mouse_x, mouse_y);

console.log(`Selected: ${clicked_node.id}`);
console.log(`Position: (${clicked_node.position.x}, ${clicked_node.position.y}, ${clicked_node.position.z})`);
console.log(`Modality: ${clicked_node.modality}`);

3.2 Interaction Patterns

Pattern 1: Inspect Node (Human reads AI's focus)

// Human hovers over node
node.addEventListener('mouseenter', (event) => {
    let k3d_data = event.target.userData.k3d;

    // Display tooltip
    tooltip.innerHTML = `
        <strong>${k3d_data.modality.data}</strong><br>
        Confidence: ${(k3d_data.semantic.confidence * 100).toFixed(1)}%<br>
        Source: ${k3d_data.provenance.source}<br>
        Last accessed: ${k3d_data.memory_state.last_accessed}
    `;
});

Pattern 2: Query Similar Concepts (Human explores AI's associations)

// User right-clicks node for "Find Similar"
node.addEventListener('contextmenu', (event) => {
    event.preventDefault();

    let clicked_node_id = event.target.userData.k3d.id;

    // Send query to AI client via action buffer
    ai_client.emit_action({
        action_type: "QUERY_SIMILAR",
        target_node_id: clicked_node_id,
        k: 10  // Find 10 most similar
    });

    // AI returns similar node IDs
    // Human client highlights them visually
});

Pattern 3: Follow Reasoning Path (Human observes AI's inference)

// AI emits reasoning path as sequence of node IDs
let reasoning_path = ai_client.get_reasoning_path_for_query("What is a neuron?");

// Animate path visualization
reasoning_path.forEach((node_id, index) => {
    setTimeout(() => {
        let node = scene.getObjectByProperty('userData.k3d.id', node_id);
        node.material.emissive.set(0x00ff00);  // Green glow
    }, index * 100);  // 100ms per hop
});

4. AI Client Interface

4.1 Embedding Query API

Requirements:

  • Sub-100µs latency for spatial/semantic queries
  • SIMD-optimized similarity computations
  • Frustum culling for FOV-based filtering

Implementation (PTX Kernels + Python Bindings):

class AIClient:
    def __init__(self, galaxy: Galaxy):
        self.galaxy = galaxy
        self.cranium = Cranium()  # PTX kernel suite

    def query_spatial_radius(self, center: np.ndarray, radius: float) -> List[K3DNode]:
        """
        Find all nodes within radius of center position.

        Args:
            center: (x, y, z) position
            radius: Search radius

        Returns:
            List of K3D nodes within radius

        Performance: ~15µs (octree-accelerated)
        """
        return self.galaxy.query_spatial_radius(center, radius)

    def query_embedding_similarity(self, query_embedding: np.ndarray, k: int = 10, threshold: float = 0.0) -> List[Tuple[K3DNode, float]]:
        """
        Find K most similar nodes by embedding cosine similarity.

        Args:
            query_embedding: 1024-dim query vector
            k: Number of results
            threshold: Minimum similarity [0.0, 1.0]

        Returns:
            List of (node, similarity_score) tuples

        Performance: ~32µs for K=10 (SIMD-optimized)
        """
        # GPU-accelerated batch cosine similarity
        similarities = self.cranium.batch_cosine_similarity(
            query_embedding,
            self.galaxy.get_all_embeddings()  # (N, 1024) tensor
        )

        # Top-K selection (GPU parallel reduction)
        top_k_indices = self.cranium.topk_indices(similarities, k)

        # Retrieve nodes
        results = []
        for idx in top_k_indices:
            if similarities[idx] >= threshold:
                node = self.galaxy.get_node_by_index(idx)
                results.append((node, similarities[idx]))

        return results

    def query_hybrid(self, center: np.ndarray, radius: float, query_embedding: np.ndarray, k: int = 10) -> List[Tuple[K3DNode, float]]:
        """
        Hybrid spatial + semantic query (intersection).

        1. Spatial filter: Find nodes within radius
        2. Semantic ranking: Rank by embedding similarity

        Performance: ~45µs (spatial first, then semantic)
        """
        # Step 1: Spatial filter (fast)
        spatial_candidates = self.query_spatial_radius(center, radius)

        # Step 2: Semantic ranking (on smaller candidate set)
        candidate_embeddings = np.array([node.embedding for node in spatial_candidates])
        similarities = self.cranium.batch_cosine_similarity(query_embedding, candidate_embeddings)

        # Top-K from candidates
        top_k_indices = np.argsort(similarities)[-k:][::-1]
        results = [(spatial_candidates[i], similarities[i]) for i in top_k_indices]

        return results

Query Example:

# AI query: "Find concepts similar to 'neuron' within 10 spatial units of (50, 20, 30)"
center = np.array([50.0, 20.0, 30.0])
neuron_embedding = ai_client.galaxy.get_node_by_id("neuron_12345").embedding

similar_nodes = ai_client.query_hybrid(
    center=center,
    radius=10.0,
    query_embedding=neuron_embedding,
    k=10
)

for node, similarity in similar_nodes:
    print(f"{node.modality_data}: {similarity:.3f}")
# Output:
#   Synapse: 0.923
#   Axon: 0.887
#   Dendrite: 0.854
#   ...

4.2 Action Emission API

Purpose: AI communicates intentions/actions to human via standardized 288-byte buffer.

Action Buffer Format:

// 288-byte action buffer (GPU-friendly alignment)
struct K3DAction {
    // Header (16 bytes)
    uint32_t action_type;     // NAVIGATE, QUERY, GENERATE, HIGHLIGHT, etc.
    uint32_t timestamp_ms;    // Milliseconds since epoch
    uint32_t node_id;         // Target node ID (or 0 if N/A)
    uint32_t confidence;      // Confidence score [0, 100]

    // Spatial parameters (28 bytes)
    float position[3];        // Target (x, y, z)
    float quaternion[4];      // Orientation
    float scale;              // Size/importance
    float radius;             // For spatial queries

    // Semantic parameters (64 bytes)
    float query_embedding[16];  // Compressed embedding (16-dim PCA)

    // Metadata (64 bytes)
    char description[64];     // UTF-8 human-readable description

    // Reserved (116 bytes)
    uint8_t reserved[116];    // For future extensions
};

Action Types:

class ActionType(IntEnum):
    NAVIGATE = 1        # AI moves avatar to position
    QUERY = 2           # AI performs spatial/semantic query
    HIGHLIGHT = 3       # AI draws attention to specific node
    GENERATE = 4        # AI creates new node (knowledge synthesis)
    REMOVE = 5          # AI suggests removing redundant node
    MERGE = 6           # AI suggests merging similar nodes
    ANNOTATE = 7        # AI adds metadata/comment to node
    PATH = 8            # AI emits reasoning path (sequence of nodes)

Emission Example:

# AI decides to navigate to "synapse" concept
action = K3DAction(
    action_type=ActionType.NAVIGATE,
    position=synapse_node.position,  # (52.1, 21.3, 31.5)
    node_id=synapse_node.id,
    confidence=92,  # 92% confident this is relevant
    description="Navigating to synapse concept (related to neuron)"
)

# Emit to action buffer (human client reads this)
ai_client.emit_action(action)

Human Client Reaction:

// Human client polls action buffer (60 FPS)
function processAIActions() {
    let action = ai_client_bridge.read_action_buffer();

    if (action.action_type === ActionType.NAVIGATE) {
        // Render AI avatar movement
        ai_avatar.position.set(action.position[0], action.position[1], action.position[2]);

        // Show path trail
        path_trail.add(action.position);

        // Display tooltip
        tooltip.innerHTML = `AI: ${action.description} (${action.confidence}% confident)`;
    }
}

setInterval(processAIActions, 16);  // 60 FPS polling

5. Synchronization Guarantees

5.1 Temporal Consistency

Problem: Human and AI clients may query at different times—need to ensure consistent view.

Solution: Timestamp-based versioning

class K3DNode:
    def __init__(self):
        self.version = 1
        self.timestamp = now()

    def update_embedding(self, new_embedding):
        """Update embedding with versioning."""
        self.embedding_previous = self.embedding
        self.embedding = new_embedding
        self.version += 1
        self.timestamp = now()

# Human queries node at t=100ms
human_node_data = galaxy.get_node("neuron_123", timestamp=100)

# AI queries node at t=105ms (slightly later)
ai_node_data = galaxy.get_node("neuron_123", timestamp=105)

# Guarantee: If version unchanged, data identical
assert human_node_data.version == ai_node_data.version
assert np.array_equal(human_node_data.embedding, ai_node_data.embedding)

Validation:

  • ✅ 100% consistency when version numbers match (Phase G: 1,000,000 cross-client queries)
  • ✅ Version mismatches logged and displayed to human (transparency)

5.2 Spatial Consistency

Problem: Floating-point precision can cause coordinate drift (10.5 vs 10.500001).

Solution: Fixed-precision quantization

SPATIAL_PRECISION = 0.001  # 1mm resolution

def quantize_position(position: np.ndarray) -> np.ndarray:
    """Round position to fixed precision."""
    return np.round(position / SPATIAL_PRECISION) * SPATIAL_PRECISION

# Example:
raw_position = np.array([10.500001, 23.099999, -5.300003])
quantized = quantize_position(raw_position)
# Result: [10.500, 23.100, -5.300]

# Both clients use quantized coordinates
human_query = quantize_position(clicked_position)
ai_query = quantize_position(ai_target_position)

assert np.allclose(human_query, ai_query, atol=SPATIAL_PRECISION)

Validation:

  • ✅ Zero spatial mismatches after quantization (Phase G: 10,000 cross-client spatial queries)

5.3 Data Integrity

Problem: Ensure glTF data not corrupted during transmission/loading.

Solution: SHA256 checksums

def load_gltf_with_verification(path: str, expected_checksum: str):
    """Load glTF and verify checksum."""
    glb_data = open(path, 'rb').read()
    actual_checksum = hashlib.sha256(glb_data).hexdigest()

    if actual_checksum != expected_checksum:
        raise ChecksumMismatchError(
            f"GLB corrupted: expected {expected_checksum}, got {actual_checksum}"
        )

    return parse_gltf(glb_data)

# Both clients verify same checksum
human_client.load_gltf_with_verification(glb_path, checksum)
ai_client.load_gltf_with_verification(glb_path, checksum)

# Guarantee: Identical data loaded

Validation:

  • ✅ Zero checksum mismatches (Phase G: all 51,532 nodes verified)

6. Validation & Testing

6.1 Cross-Client Consistency Tests

Test 1: Identical Node Retrieval

def test_cross_client_consistency():
    """Verify human and AI retrieve identical node data."""
    node_id = "neuron_12345"

    # Human client query
    human_node = human_client.get_node_by_id(node_id)

    # AI client query
    ai_node = ai_client.get_node_by_id(node_id)

    # Assert identity
    assert human_node.id == ai_node.id
    assert np.array_equal(human_node.position, ai_node.position)
    assert np.array_equal(human_node.embedding, ai_node.embedding)
    assert human_node.timestamp == ai_node.timestamp

# Phase G: 10,000 tests, 100% pass rate

Test 2: Spatial Query Parity

def test_spatial_query_parity():
    """Verify human and AI spatial queries return same nodes."""
    center = np.array([10.0, 20.0, 30.0])
    radius = 5.0

    # Human client query (via Three.js raycaster)
    human_results = human_client.query_spatial_radius(center, radius)

    # AI client query (via PTX kernel)
    ai_results = ai_client.query_spatial_radius(center, radius)

    # Assert same nodes returned (order may differ)
    human_ids = set(node.id for node in human_results)
    ai_ids = set(node.id for node in ai_results)
    assert human_ids == ai_ids

# Phase G: 1,000 tests, 100% pass rate

Test 3: Action Buffer Latency

def test_action_buffer_latency():
    """Verify human sees AI actions within 1 frame (16.67ms @ 60 FPS)."""
    # AI emits action
    t0 = time.time()
    ai_client.emit_action(action)

    # Human client reads action
    action_received = human_client.poll_action_buffer()
    t1 = time.time()

    latency_ms = (t1 - t0) * 1000
    assert latency_ms < 16.67  # Sub-frame latency

# Phase G: Mean latency 2.3ms, P99 latency 8.1ms (both < 16.67ms)

6.2 Production Metrics

Cross-Client Queries (Phase G):

  • ✅ 1,000,000 queries, 100% consistency (checksums matched)
  • ✅ Zero temporal mismatches (version synchronization working)
  • ✅ Zero spatial mismatches (quantization working)

Action Buffer Performance:

  • ✅ Mean latency: 2.3ms (7× faster than 16.67ms frame time)
  • ✅ P99 latency: 8.1ms (still 2× faster than frame time)
  • ✅ Action drop rate: 0% (reliable delivery)

7. Future Enhancements

7.1 Multi-User Shared Reality (Q2 2026)

Current: Single human + single AI Planned: Multiple humans + multiple AIs in same 3D space

Challenges:

  • Conflict resolution (two AIs try to modify same node)
  • Latency (network synchronization)
  • Scaling (100+ concurrent users)

7.2 WebXR Integration (Q3 2026)

Current: Desktop browser (Three.js) Planned: VR/AR headsets (WebXR API)

Benefits:

  • Embodied human presence (avatar in VR)
  • Hand tracking (natural interaction with nodes)
  • Spatial audio (hear AI's reasoning spatially)

7.3 Formal Verification (Q4 2026)

Goal: Mathematically prove dual-client contract guarantees

Approach: Model checking (TLA+)

  • Prove: ∀ node_id, human_data(node_id) ≡ ai_data(node_id)
  • Prove: Action buffer delivery latency < 16.67ms (with 99.9% probability)

8. References


Attribution & Academic Context

For complete attributions, see ATTRIBUTIONS.md in the K3D repository.

Key Credits:

  1. WebXR Device API (W3C):

    • Foundation for VR/AR web experiences
    • K3D extends with dual-client (human + Synthetic User) contract
  2. glTF 2.0 (Khronos Group):

    • Shared 3D asset format
    • K3D uses for shared reality between human and AI clients
  3. DeepSeek-OCR (Dual-Texture Rendering):

    • Text-as-image compression for AI texture layer
    • K3D adapts for dual UV mapping
  4. Game Industry (Dual-Client Paradigms):

    • Spectator mode concepts
    • K3D applies to human-AI collaboration

K3D's Dual-Client Contract is a novel contribution that enables humans and Synthetic Users to inhabit the same spatial knowledge environment.


Contact & License

Author: Daniel Campos Ramos, K3D Architect Email: daniel@echosystems.ai Repository: https://github.com/danielcamposramos/Knowledge3D License: CC-BY-4.0 (specification), Apache 2.0 (implementation code)


Status: Production (Phase G Complete, October 2025) Next Review: Q1 2026 (for W3C CG Note submission)


Proposed W3C Community Group Path:

  1. Q1 2026: Publish as W3C Community Group Draft Report
  2. Q2 2026: Propose WebXR extension for Synthetic Users ("WebXR Synthetic User API")
  3. Q3 2026: Collaborate with Khronos Group on glTF extensions for dual-client metadata
  4. 2027: W3C Recommendation for "Shared Reality Interfaces for Human-AI Collaboration"