Architecture

Overview

BlueCache enables high-throughput, CPU-bypass data movement between host GPU memory and BlueField DPU memory or storage. It is implemented as two companion components:

  1. NIXL backend plugin (BLUE_CACHE) — runs on the host, inside a NIXL application.
  2. DPU agent (blue-cache) — runs on the BlueField DPU ARM cores.

Host                                              DPU
┌─────────────────────────────┐                  ┌─────────────────────────────┐
│  Application (LMCache etc.) │                  │  blue-cache agent           │
│         │                   │                  │  ┌─────────────────────┐    │
│         ▼                   │   DOCA Comch     │  │ DOCA DMA engine     │    │
│  ┌─────────────┐            │   or TCP         │  │  ┌───────────────┐  │    │
│  │ NIXL agent  │            │◄────────────────►│  │  │ staging buffer │  │    │
│  │ + BLUE_     │            │   control        │  │  └───────┬───────┘  │    │
│  │   CACHE     │            │   messages       │  │          │          │    │
│  └──────┬──────┘            │                  │  │          ▼          │    │
│         │                   │                  │  │  ┌───────────────┐  │    │
│         ▼ VRAM_SEG          │                  │  │  │ NIXL storage  │  │    │
│  ┌─────────────┐            │                  │  │  │ backend       │  │    │
│  │ GPU buffer  │            │   DOCA DMA       │  │  │ (POSIX/OBJ)   │  │    │
│  │ (exported   │◄───────────┴──────────────────►│  │  └───────┬───────┘  │    │
│  │  via PCI)   │                                │  │          │          │    │
│  └─────────────┘                                │  │          ▼          │    │
│                                                 │  │  ┌───────────────┐  │    │
│                                                 │  │  │ local storage │  │    │
│                                                 │  │  └───────────────┘  │    │
└─────────────────────────────────────────────────┘  └─────────────────────┘    │

Memory Types

The plugin exposes two NIXL memory types:

  • VRAM_SEG — Host GPU memory. Registered buffers are exported via doca_mmap_export_pci() so the DPU can import them.
  • OBJ_SEG — DPU-resident object. The metaInfo field carries the object path or key; the actual I/O happens on the DPU.

The backend is local-only (supportsRemote() == false). Both source and destination must be reachable through the same host-side BlueField PCI function.
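As a rough illustration of the pairing rule implied above, the following sketch accepts a transfer only when the local side is GPU memory and the remote side is a DPU-resident object. The enum and function names are hypothetical stand-ins, not the NIXL types, and the rule itself is an assumption read off the two memory types listed here.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the NIXL memory types; not the real enum. */
enum ex_mem_type { EX_DRAM_SEG, EX_VRAM_SEG, EX_FILE_SEG, EX_OBJ_SEG };

/*
 * Assumed rule: every transfer pairs a host GPU buffer (local side)
 * with a DPU-resident object (remote side), for reads and writes alike.
 */
bool ex_pair_supported(enum ex_mem_type local, enum ex_mem_type remote)
{
    return local == EX_VRAM_SEG && remote == EX_OBJ_SEG;
}
```

A check of this kind would naturally run before any control message is sent, so unsupported combinations fail early on the host.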

Control Plane

The host plugin and DPU agent communicate through a small request/response protocol defined in common/include/dma_transfer.h.

Two transports are supported:

  • DOCA Communication Channel (Comch) — default. Uses DOCA Comch to send messages without requiring a reachable DPU management IP.
  • TCP — fallback on port 18517. Used when Comch is unavailable or misconfigured.

The control plane carries only metadata: operation direction, file/object path, remote GPU address, PCI export descriptor, and status. Bulk data never traverses the control plane.
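The transport preference described above (Comch first, TCP on port 18517 as fallback) can be sketched as a small selection function. All identifiers below are hypothetical, and how availability is actually probed is not covered by this document, so the two flags are simply passed in.

```c
#include <stdbool.h>

/* Fallback port taken from the text; the macro name is hypothetical. */
#define EX_FALLBACK_TCP_PORT 18517

enum ex_transport { EX_TRANSPORT_NONE, EX_TRANSPORT_COMCH, EX_TRANSPORT_TCP };

struct ex_transport_choice {
    enum ex_transport kind;
    int tcp_port; /* 0 unless kind == EX_TRANSPORT_TCP */
};

/* Prefer Comch; fall back to TCP only when Comch is not usable. */
struct ex_transport_choice ex_choose_transport(bool comch_available,
                                               bool tcp_available)
{
    struct ex_transport_choice choice = { EX_TRANSPORT_NONE, 0 };

    if (comch_available) {
        choice.kind = EX_TRANSPORT_COMCH;
    } else if (tcp_available) {
        choice.kind = EX_TRANSPORT_TCP;
        choice.tcp_port = EX_FALLBACK_TCP_PORT;
    }
    return choice;
}
```

Either transport carries the same metadata-only messages, so the rest of the plugin would not need to know which one was selected.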

Data Plane

For a write (NIXL_WRITE, GPU → DPU storage):

  1. Host plugin creates a DOCA mmap over the GPU buffer and exports it.
  2. Host plugin sends a batch of DMA_REQ_BATCH_PUSH requests.
  3. DPU agent imports the remote mmap.
  4. DPU agent DMAs chunks from GPU into its pre-allocated staging buffer.
  5. DPU agent writes chunks to storage via its NIXL storage backend.
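Steps 4 and 5 imply that a large GPU buffer is processed in pieces no larger than the staging buffer. A minimal sketch of that chunking arithmetic follows; the struct and function names are hypothetical, and the real agent may align or size chunks differently.

```c
#include <assert.h>
#include <stdint.h>

/* One piece of a larger transfer: byte offset and length within the buffer. */
struct ex_chunk {
    uint64_t offset;
    uint64_t length;
};

/* Number of chunks needed to cover `total` bytes (ceiling division). */
uint64_t ex_chunk_count(uint64_t total, uint64_t chunk_size)
{
    assert(chunk_size > 0);
    return total / chunk_size + (total % chunk_size != 0);
}

/* Offset and length of chunk `index`; the final chunk may be short. */
struct ex_chunk ex_chunk_at(uint64_t total, uint64_t chunk_size, uint64_t index)
{
    struct ex_chunk c;
    uint64_t remaining;

    assert(index < ex_chunk_count(total, chunk_size));
    c.offset = index * chunk_size;
    remaining = total - c.offset;
    c.length = remaining < chunk_size ? remaining : chunk_size;
    return c;
}
```

For example, a 10-byte buffer with a 4-byte chunk size yields three chunks, the last covering bytes 8 and 9.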

For a read (NIXL_READ, DPU storage → GPU):

  1. Host plugin queries the object size (DMA_REQ_PULL_INFO).
  2. Host plugin exports a writable GPU mmap and sends DMA_REQ_BATCH_PULL.
  3. DPU agent uses a pipelined reader/DMA worker to overlap storage reads with DMA back to GPU.
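To see why the overlap in step 3 matters, the following is a simplified timing model, not agent code. It assumes one reader, one DMA worker, uniform per-chunk costs, and a fixed number of staging slots that are reused round-robin; all names and the cost model itself are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define EX_MAX_SLOTS 64

/*
 * Completion time for n_chunks when storage reads overlap with DMA.
 * A chunk's read may start once the reader is idle and the chunk's
 * slot has been drained by its previous DMA.
 */
uint64_t ex_pipelined_time(uint64_t n_chunks, uint64_t read_t,
                           uint64_t dma_t, unsigned slots)
{
    uint64_t slot_free_at[EX_MAX_SLOTS] = { 0 };
    uint64_t reader_free = 0, dma_free = 0;

    assert(slots >= 1 && slots <= EX_MAX_SLOTS);
    for (uint64_t i = 0; i < n_chunks; i++) {
        unsigned s = (unsigned)(i % slots);
        uint64_t read_start =
            reader_free > slot_free_at[s] ? reader_free : slot_free_at[s];
        uint64_t read_done = read_start + read_t;
        uint64_t dma_start = read_done > dma_free ? read_done : dma_free;
        uint64_t dma_done = dma_start + dma_t;

        reader_free = read_done;
        dma_free = dma_done;
        slot_free_at[s] = dma_done;
    }
    return dma_free;
}
```

With four chunks, a read cost of 3, and a DMA cost of 2, two slots give a total of 14 time units, while a single slot degenerates to the fully serial 20. Under this model, the slowest stage sets the steady-state rate once at least two slots are available.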

blue-cache Internals

The DPU agent (blue-cache/src/blue_cache_agent.c) maintains:

  • A single DOCA DMA device/context/progress engine.
  • A reusable staging buffer registered once at startup.
  • A slot pool sized by queue_depth.
  • A pluggable storage backend (storage_backend.cpp) implemented on top of NIXL:
    • posix_storage_backend — local files via NIXL FILE_SEG.
    • xdfs_storage_backend — object storage via NIXL OBJ_SEG (optional).
    • xdfs_kv_storage_backend — object storage with key validation (optional).
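The slot pool can be pictured as a fixed set of equally sized regions carved out of the staging buffer, handed out and returned as transfers progress. The sketch below uses a simple free stack; the names, the 64-slot cap, and the single-threaded design are assumptions, and the real agent may track slots differently.

```c
#include <assert.h>
#include <stddef.h>

#define EX_SLOT_POOL_MAX 64

struct ex_slot_pool {
    size_t slot_size;   /* bytes of staging buffer per slot */
    unsigned depth;     /* number of slots, i.e. queue_depth */
    unsigned free_top;  /* number of entries on the free stack */
    unsigned free_stack[EX_SLOT_POOL_MAX];
};

void ex_slot_pool_init(struct ex_slot_pool *p, unsigned queue_depth,
                       size_t slot_size)
{
    assert(p != NULL);
    assert(queue_depth >= 1 && queue_depth <= EX_SLOT_POOL_MAX);
    p->slot_size = slot_size;
    p->depth = queue_depth;
    p->free_top = queue_depth;
    /* Push in reverse so that slot 0 is handed out first. */
    for (unsigned i = 0; i < queue_depth; i++)
        p->free_stack[i] = queue_depth - 1 - i;
}

/* Returns a slot index, or -1 when every slot is in flight. */
int ex_slot_pool_acquire(struct ex_slot_pool *p)
{
    if (p->free_top == 0)
        return -1;
    return (int)p->free_stack[--p->free_top];
}

void ex_slot_pool_release(struct ex_slot_pool *p, int slot)
{
    assert(slot >= 0 && (unsigned)slot < p->depth);
    assert(p->free_top < p->depth);
    p->free_stack[p->free_top++] = (unsigned)slot;
}

/* Byte offset of a slot inside the staging buffer. */
size_t ex_slot_offset(const struct ex_slot_pool *p, int slot)
{
    return (size_t)slot * p->slot_size;
}
```

Because the staging buffer is registered once at startup, acquiring a slot in this scheme is only index bookkeeping and involves no new memory registration.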

Host Plugin Internals

The host plugin (nixl-plugin/src/blue_cache_backend.cpp) implements the NIXL backend engine interface:

  • registerMem / deregisterMem — manage DOCA mmaps for GPU buffers and object descriptors.
  • prepXfer / postXfer — validate descriptor pairs and spawn a worker thread that sends batched control requests.
  • checkXfer — poll the asynchronous request handle state.

Transfers are asynchronous: postXfer returns NIXL_IN_PROG and a background worker drives the control channel.
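The post-then-poll pattern can be sketched with a single atomic state per request handle. This is an illustration only: the type and function names are hypothetical, the state values are not the NIXL status codes, and the worker thread is represented by a plain completion function that a real worker would call.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical request states; not the NIXL status enum. */
enum ex_xfer_state { EX_XFER_IN_PROG = 0, EX_XFER_DONE = 1, EX_XFER_FAILED = 2 };

struct ex_xfer_handle {
    atomic_int state;
};

/* Post: mark the request in flight and return immediately. */
enum ex_xfer_state ex_xfer_post(struct ex_xfer_handle *h)
{
    assert(h != NULL);
    atomic_store_explicit(&h->state, EX_XFER_IN_PROG, memory_order_release);
    /* A real plugin would hand `h` to its background worker here. */
    return EX_XFER_IN_PROG;
}

/* Called by the worker once the control exchange has finished. */
void ex_xfer_complete(struct ex_xfer_handle *h, int ok)
{
    atomic_store_explicit(&h->state, ok ? EX_XFER_DONE : EX_XFER_FAILED,
                          memory_order_release);
}

/* Check: non-blocking read of the current state. */
enum ex_xfer_state ex_xfer_check(struct ex_xfer_handle *h)
{
    return (enum ex_xfer_state)atomic_load_explicit(&h->state,
                                                    memory_order_acquire);
}
```

The release store paired with the acquire load means that a caller who observes the done state also observes everything the worker wrote before completing.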

Wire Protocol Versioning

common/include/dma_transfer.h defines:

  • Magic number DMA_TRANSFER_MAGIC (0x44545246 — "DTRF").
  • Protocol version DMA_TRANSFER_VERSION.
  • Request/response structs including batched segment layout.

Both the host plugin and the DPU agent must be built against the same protocol version. The project keeps this header as the single source of truth for the wire format.
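A receiver would typically reject a message whose magic or version does not match before reading anything else. The sketch below shows such a check. The magic value comes from the text; the header layout, the enum, and the function names are hypothetical, and the local version is passed in as a parameter because the actual value of DMA_TRANSFER_VERSION is not given here.

```c
#include <stdint.h>

#define DMA_TRANSFER_MAGIC 0x44545246u /* "DTRF", value taken from the text */

/* Hypothetical header layout; the real structs live in dma_transfer.h. */
struct ex_msg_header {
    uint32_t magic;
    uint32_t version;
};

enum ex_hdr_check { EX_HDR_OK, EX_HDR_BAD_MAGIC, EX_HDR_VERSION_MISMATCH };

/* Validate the magic first, then compare the peer's version with ours. */
enum ex_hdr_check ex_check_header(const struct ex_msg_header *h,
                                  uint32_t local_version)
{
    if (h->magic != DMA_TRANSFER_MAGIC)
        return EX_HDR_BAD_MAGIC;
    if (h->version != local_version)
        return EX_HDR_VERSION_MISMATCH;
    return EX_HDR_OK;
}

/* Render the magic as four ASCII characters, most significant byte first. */
void ex_magic_to_text(uint32_t magic, char out[5])
{
    out[0] = (char)((magic >> 24) & 0xFF);
    out[1] = (char)((magic >> 16) & 0xFF);
    out[2] = (char)((magic >> 8) & 0xFF);
    out[3] = (char)(magic & 0xFF);
    out[4] = '\0';
}
```

Checking the magic before the version distinguishes a peer that speaks a different protocol entirely from one that is merely out of date, which makes mismatches easier to diagnose.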