
Add host-IB zero-copy transport + per-peer mapper cache #2736

Open
Regina8023 wants to merge 1 commit into meta-pytorch:main from Regina8023:export-D106138307
@Regina8023
Contributor

Summary:
Introduce the host-side IB P2P transport stack, end-to-end for the
pure zero-copy (ZC) mode. This is the first of two diffs; the
copy-based (CB) mode lands in a follow-up.

What's in this diff:

  • ctran::transport::IP2pHostTransport: the backend-neutral abstract
    interface that algorithms see. It carries pure ctrl-msg primitives
    (iSendCtrl/iRecvCtrl take a raw void* payload plus its size) and
    mode-agnostic SendChunkArgs / RecvChunkArgs.
  • ctran::transport::ib::HostZcTransport: the pure-ZC concrete impl. No
    staging buffers, no state machine. iSendChunk is a thin wrapper
    around vcs_[vcIdx]->iput; iRecvChunk bumps per-VC notify counters.
  • ctran::transport::ib::impl: shared ctrl-plane helpers
    (ensureVcsReady, iSendCtrlImpl, iRecvCtrlImpl, testCtrlDoneImpl,
    waitCtrlDoneImpl, progressImpl, exportRecvBuf, importRemoteInfo,
    toIbRegElem, CtrlRequestPriv friend struct).
  • CtranMapper::getP2pTransport(peer, mode) per-peer cache. The
    kCopyBased branch is a temporary FB_CHECKABORT pointing at the
    follow-up diff.
  • MockP2pHostTransport.h for unit-testing algos against the
    interface.
  • ctran_mapper_host_transport_ut: cache lifecycle + per-peer
    caching tests (ZC scope).
  • host_transport_dist_ut: end-to-end ZC RoundTrip +
    BidirectionalRoundTrip across {1, 2, 4} channels on real IB.
  • HostTransportDesign.md: full architecture doc describing both ZC
    and CB. CB sections describe the API + state machine the
    follow-up diff implements.
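
For readers skimming the list above, here is a minimal sketch of how a per-peer transport cache behind an abstract interface could look. It is not the code in this diff. Every `Sketch*` name is hypothetical, the `iSendCtrl` signature is simplified to a raw payload plus size, `kZeroCopy` is an assumed enumerator name (only `kCopyBased` appears in the summary), the cache is assumed to be keyed by peer alone, and the copy-based branch throws where the diff uses FB_CHECKABORT.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <stdexcept>

// Hypothetical stand-ins: the real ctran types and signatures are not
// shown in this PR description.
enum class SketchMode { kZeroCopy, kCopyBased };

// Minimal abstract interface in the spirit of IP2pHostTransport.
class SketchTransport {
 public:
  virtual ~SketchTransport() = default;
  // Posts a control message given as raw bytes; returns the bytes accepted.
  virtual std::size_t iSendCtrl(const void* payload, std::size_t size) = 0;
};

// Zero-copy flavour: no staging buffer, it only counts what was posted.
class SketchZcTransport : public SketchTransport {
 public:
  std::size_t iSendCtrl(const void* payload, std::size_t size) override {
    (void)payload;  // a real transport would hand this to the IB backend
    bytesPosted_ += size;
    return size;
  }
  std::size_t bytesPosted() const { return bytesPosted_; }

 private:
  std::size_t bytesPosted_{0};
};

// Per-peer cache: one transport per peer rank, created lazily on first use.
class SketchMapper {
 public:
  SketchTransport* getP2pTransport(int peer, SketchMode mode) {
    if (mode == SketchMode::kCopyBased) {
      // The diff aborts here via FB_CHECKABORT; this sketch throws instead.
      throw std::logic_error("copy-based mode is not part of this sketch");
    }
    auto it = cache_.find(peer);
    if (it == cache_.end()) {
      it = cache_.emplace(peer, std::make_unique<SketchZcTransport>()).first;
    }
    return it->second.get();
  }
  std::size_t cachedPeers() const { return cache_.size(); }

 private:
  std::map<int, std::unique_ptr<SketchTransport>> cache_;
};
```

In this sketch, repeated calls for the same peer return the same pointer, so an algorithm can fetch the transport on every operation without re-creating connection state. The mapper owns the transports, which ties their lifetime to the mapper's.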

Reviewed By: minsii

Differential Revision: D106138307

@meta-cla meta-cla Bot added the CLA Signed label May 30, 2026
@meta-codesync
Contributor

meta-codesync Bot commented May 30, 2026

@Regina8023 has exported this pull request. If you are a Meta employee, you can view the originating Diff in D106138307.

Labels

CLA Signed, fb-exported, meta-exported
