Add multi-GPU vector addition challenge + script support #236
Open
kunal-mansukhani wants to merge 2 commits into main from
Conversation
New Pro-only challenge `challenges/easy/100_multi_gpu_vector_add` (`num_gpus=2`) with starter + solution for all 5 accelerated languages (PyTorch, CUDA, Triton, CuTe, JAX). Validates the multi-GPU runner path end-to-end on 2x and 4x T4 via `ncclAllReduce` / `dist.all_reduce`.

- `scripts/update_challenges.py`: forward `num_gpus` from `ChallengeBase` to the backend (was previously unused).
- `scripts/run_challenge.py`:
  - Auto-detect `num_gpus` from `challenge.py` and send `gpuCount` in the submission payload (with a `--gpu-count` override flag).
  - Prefer language-tagged solution filenames (`solution.triton.py`, `solution.cute.py`, etc.) so multiple Python-based languages can coexist in one `solution/` directory.
  - Print all stdout + stderr websocket frames, and terminate on `test-case-failed` / `compilation-failed` / `out-of-memory`.

Companion PR on leetgpu-infra: AlphaGPU/leetgpu-infra#333.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
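The slice-plus-all-reduce pattern the solutions share (`ncclAllReduce` / `dist.all_reduce`) can be sketched in plain Python, with ranks simulated in-process. No real NCCL or `torch.distributed` is involved here; all function names are illustrative, not the PR's actual code:

```python
def solve_on_rank(A, B, rank, world_size):
    """Each rank computes its contiguous slice of C = A + B, leaving zeros elsewhere."""
    n = len(A)
    chunk = (n + world_size - 1) // world_size  # ceil-divide the work across ranks
    lo, hi = rank * chunk, min((rank + 1) * chunk, n)
    C = [0.0] * n
    for i in range(lo, hi):
        C[i] = A[i] + B[i]
    return C

def all_reduce_sum(partials):
    """Elementwise sum across ranks, like ncclAllReduce / dist.all_reduce with SUM."""
    return [sum(vals) for vals in zip(*partials)]

# Inputs are fully replicated on every rank; after the all-reduce, every
# rank holds the complete output, which is what the runner validates per rank.
A = [1.0, 2.0, 3.0, 4.0, 5.0]
B = [10.0, 20.0, 30.0, 40.0, 50.0]
world_size = 2
partials = [solve_on_rank(A, B, r, world_size) for r in range(world_size)]
C = all_reduce_sum(partials)
```

Because each rank writes zeros outside its slice, summing the per-rank partials reconstructs the full vector without overlap.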
Addresses a code review finding: the previous `parse_num_gpus` helper matched any `num_gpus=N` occurrence in the source, which breaks on comments, docstrings, or non-literal assignments. Instead, mirror the loading dance from scripts/update_challenges.py: load the module via an importlib spec, instantiate `Challenge()`, and read `num_gpus` as an attribute. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
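A minimal sketch of that loader, assuming the challenge module defines a `Challenge` class with a `num_gpus` attribute (the helper name and default are hypothetical):

```python
import importlib.util

def load_num_gpus(challenge_path, default=1):
    """Import challenge.py as a module and read num_gpus off a Challenge() instance."""
    spec = importlib.util.spec_from_file_location("challenge_module", challenge_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # executes the top-level code of challenge.py
    return getattr(module.Challenge(), "num_gpus", default)
```

Unlike the regex approach, this only sees the value the class actually exposes, so a `num_gpus=4` inside a docstring or comment cannot mislead it.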
Summary
New Pro-only multi-GPU test challenge at `challenges/easy/100_multi_gpu_vector_add` (`num_gpus=2`), with starter + solution for all 5 accelerated languages (PyTorch, CUDA, Triton, CuTe, JAX). Used to validate the multi-GPU runner path end-to-end in the companion infra PR (AlphaGPU/leetgpu-infra#333). Plus two script updates that were needed to make multi-GPU submissions work via `run_challenge.py`.

Changes
New challenge: `100_multi_gpu_vector_add`

- `challenge.py`: `num_gpus=2`, `access_tier="pro"`. Reference implementation is single-device (`torch.add(A, B, out=C)`); the runner runs it independently on each rank and validates per-rank outputs.
- `challenge.html`: spec explaining the fully-replicated input/output model and the env vars exposed to `solve()` (`RANK`, `WORLD_SIZE`, `LOCAL_RANK`, `LEETGPU_NCCL_ID_FILE`).
- Solutions: PyTorch (`dist.all_reduce`), CUDA (NCCL via `LEETGPU_NCCL_ID_FILE`), Triton (`@triton.jit` slice kernel + `dist.all_reduce`), CuTe (plain Python host + `dist.all_reduce`), JAX (trivial, validates `jax.distributed.initialize`).

`scripts/update_challenges.py`

- Forward `num_gpus` from `ChallengeBase` to the backend (field was declared but never propagated).

`scripts/run_challenge.py`

- Auto-detect `num_gpus` from `challenge.py` via regex and send `gpuCount` in the submission payload; new `--gpu-count` override flag.
- Prefer language-tagged solution filenames (`solution.triton.py`, `solution.cute.py`, `solution.jax.py`) so multiple Python-based languages can coexist in one `solution/` dir. Falls back to `solution.<ext>`.
- Print all stdout + stderr websocket frames (previously only `type=stderr` output).
- Terminate on `test-case-failed` / `compilation-failed` / `tampering-detected` / `out-of-memory` in addition to the previous terminal statuses.

Test plan
All validated end-to-end against the companion infra branch on real Modal hardware:
- PyTorch: `run` + full `submit` on 2× T4 and 4× T4
- CUDA: `run` + full `submit` on 2× T4
- Triton: `run` + full `submit` on 2× T4
- CuTe: `run` + full `submit` on 2× T4
- JAX: `run` + full `submit` on 2× T4

Notes
- The JAX solution only validates that `jax.distributed.initialize` succeeds and the output flows back correctly; a real distributed JAX pattern would use `jax.experimental.multihost_utils` / `pjit`.
- A client can still send `gpuCount=2` and the server will reject/coerce it.

🤖 Generated with Claude Code
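The language-tagged solution filename preference described in the Changes above could look roughly like this (a sketch, not the actual `run_challenge.py` code; the `LANG_EXTS` mapping and function name are assumptions):

```python
from pathlib import Path

# Hypothetical mapping: language -> plain extension used for the fallback filename.
LANG_EXTS = {"pytorch": "py", "triton": "py", "cute": "py", "jax": "py", "cuda": "cu"}

def find_solution(solution_dir, language):
    """Prefer solution.<language>.<ext>; fall back to solution.<ext>."""
    ext = LANG_EXTS[language]
    tagged = Path(solution_dir) / f"solution.{language}.{ext}"
    if tagged.exists():
        return tagged
    fallback = Path(solution_dir) / f"solution.{ext}"
    if fallback.exists():
        return fallback
    raise FileNotFoundError(f"no solution file for {language} in {solution_dir}")
```

With this preference, `solution.triton.py`, `solution.cute.py`, and `solution.jax.py` can sit next to a plain `solution.py` in one `solution/` dir without ambiguity about which file belongs to which language.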