Add MSCCLPP_IB_GID_INDEX env #780

Open
Binyang2014 wants to merge 40 commits into main from binyli/ib-no-atomic-test

@Binyang2014
Contributor

Use MSCCLPP_IB_GID_INDEX to control the IB GID index.

chhwang and others added 30 commits February 21, 2026 00:02
- Change executor to create one connection (unique QP) per channel entry
  instead of sharing connections per peer. This is required for HostNoAtomic
  IB mode where each connection can only forward signals to one semaphore
  via setSignalForwardingDst.

- Add MSCCLPP_IB_GID_INDEX environment variable to override the default
  GID index (3) used for IB transport. Set to the desired GID index value,
  or leave unset/-1 to use the default.
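
The override behaviour described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `parseGidIndex`, `gidIndexFromEnv`, and `kFallbackGidIndex` are hypothetical names, and treating malformed or out-of-range values as "use the default" is an assumption. The upper bound of 255 reflects that the verbs API stores the source GID index in an 8-bit field.

```cpp
#include <cerrno>
#include <cstdlib>

// Hypothetical helper names; only the variable name MSCCLPP_IB_GID_INDEX,
// the default of 3, and the "unset or -1 means default" rule come from the
// commit message above.
constexpr int kFallbackGidIndex = 3;

// Parse a raw environment string into a GID index.
// Returns `fallback` when the string is missing, empty, negative (e.g. "-1"),
// not a clean base-10 integer, or larger than an 8-bit index can hold.
// Falling back on malformed input is an assumption of this sketch.
int parseGidIndex(const char* raw, int fallback = kFallbackGidIndex) {
  if (raw == nullptr || *raw == '\0') return fallback;  // unset or empty
  errno = 0;
  char* end = nullptr;
  const long value = std::strtol(raw, &end, 10);
  if (errno != 0 || end == raw || *end != '\0') return fallback;  // malformed
  if (value < 0 || value > 255) return fallback;  // -1 or out of range
  return static_cast<int>(value);
}

// Read the variable from the process environment.
int gidIndexFromEnv() {
  return parseGidIndex(std::getenv("MSCCLPP_IB_GID_INDEX"));
}
```

With this sketch, launching a job with `MSCCLPP_IB_GID_INDEX=1` would make `gidIndexFromEnv()` return 1, while leaving the variable unset or setting it to `-1` would yield the fallback.
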
mahdiehghazim and others added 10 commits March 17, 2026 20:43
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add ibGidIndex to Env (default 0)
- Change DefaultGidIndex to -1 (unspecified)
- Resolve in endpoint.cc: explicit value >= 0 takes priority, otherwise use env
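
The priority rule in the last bullet can be written as a one-line resolution step. The sketch below is illustrative only: `resolveGidIndex` and `kUnspecifiedGidIndex` are hypothetical names, and the real logic in endpoint.cc may be structured differently. Only the -1 sentinel and the "explicit value >= 0 wins, otherwise use env" rule are taken from the commit message.

```cpp
// Sentinel meaning "the caller did not choose a GID index" (per the commit
// message, DefaultGidIndex becomes -1). The constant name is hypothetical.
constexpr int kUnspecifiedGidIndex = -1;

// Pick the GID index to use when creating an IB queue pair.
// configGidIndex: value from the endpoint configuration, or the sentinel.
// envGidIndex:    value read from MSCCLPP_IB_GID_INDEX (default 0 per the
//                 commit message).
constexpr int resolveGidIndex(int configGidIndex, int envGidIndex) {
  // An explicit, non-negative config value takes priority over the env value.
  return configGidIndex >= 0 ? configGidIndex : envGidIndex;
}
```

Under this rule a caller that sets the config value to 0 explicitly still gets 0 even when the environment variable says otherwise, which is why the sentinel has to be a negative number rather than 0.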

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@Binyang2014 Binyang2014 marked this pull request as ready for review April 9, 2026 21:36
@Binyang2014 Binyang2014 requested review from a team and Copilot April 10, 2026 22:32
Contributor

Copilot AI left a comment

Pull request overview

Adds an environment-variable override for the InfiniBand GID index used when creating IB queue pairs, so deployments can tune GID selection without recompiling or changing call sites.

Changes:

  • Introduces MSCCLPP_IB_GID_INDEX in the global Env and logs it when set.
  • Resolves EndpointConfig::Ib::gidIndex from the env var when the config value is unspecified (sentinel).
  • Exposes the new env value through the Python CppEnv bindings and updates the public default/sentinel in EndpointConfig::Ib.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/core/env.cpp | Reads and logs MSCCLPP_IB_GID_INDEX into the global environment object. |
| src/core/endpoint.cc | Applies env-based fallback for gidIndex before creating IB QPs. |
| python/csrc/env_py.cpp | Exposes ib_gid_index via nanobind. |
| include/mscclpp/env.hpp | Adds Env::ibGidIndex with documentation for the new env var. |
| include/mscclpp/core.hpp | Changes EndpointConfig::Ib default GID index sentinel and documents env fallback behavior. |

4 participants