
Commit d5e7844

mivertowski and claude committed
Add Send/Sync impls for DirectPtxModule and DirectCooperativeKernel
CUDA module and function handles are context-bound and thread-safe since CUDA 4.0+. This was the only CUDA wrapper type in the crate missing these impls, preventing downstream use in async runtimes that require Send+Sync state (e.g. axum/tokio servers).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 8a706b9 commit d5e7844
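
To illustrate why the missing impls blocked downstream use: a struct that holds a raw pointer is neither `Send` nor `Sync` by default, so it cannot be placed in an `Arc` and moved into spawned threads or tasks. The following is a minimal sketch only. `FakeModule`, `handle_addr`, and `read_from_threads` are hypothetical names invented for illustration and are not part of ringkernel-cuda; the pointer is never dereferenced and no CUDA call is made.

```rust
use std::ffi::c_void;
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for a driver handle wrapper (not the crate's type).
pub struct FakeModule {
    // A raw pointer field makes the struct !Send and !Sync by default.
    handle: *mut c_void,
}

// SAFETY (for this sketch only): the handle is never dereferenced and is
// only read after construction, so sharing it between threads is harmless.
unsafe impl Send for FakeModule {}
unsafe impl Sync for FakeModule {}

impl FakeModule {
    /// Builds a wrapper around an arbitrary address used as a fake handle.
    pub fn new(addr: usize) -> Self {
        FakeModule { handle: addr as *mut c_void }
    }

    /// Returns the handle's address without dereferencing it.
    pub fn handle_addr(&self) -> usize {
        self.handle as usize
    }
}

/// Shares one module with `n` threads, much as a server shares its state.
/// Removing either `unsafe impl` above makes `thread::spawn` fail to compile.
pub fn read_from_threads(module: Arc<FakeModule>, n: usize) -> Vec<usize> {
    let workers: Vec<_> = (0..n)
        .map(|_| {
            let m = Arc::clone(&module);
            thread::spawn(move || m.handle_addr())
        })
        .collect();
    workers.into_iter().map(|w| w.join().unwrap()).collect()
}
```

Calling `read_from_threads(Arc::new(FakeModule::new(0x1000)), 4)` hands the same wrapper to four threads. An async runtime that requires `Send + Sync` state imposes the same bound on whatever the handlers share.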

1 file changed

Lines changed: 17 additions & 0 deletions

crates/ringkernel-cuda/src/driver_api.rs

@@ -156,6 +156,14 @@ impl Drop for DirectPtxModule {
     }
 }
 
+// SAFETY: CUmodule handles are bound to a CUDA context, and since CUDA 4.0+
+// contexts can be accessed from any host thread. The CUDA driver serializes
+// operations on the same context internally. Callers must ensure the CUDA
+// context remains valid for the lifetime of this module (which is guaranteed
+// by the Drop impl that calls cuModuleUnload).
+unsafe impl Send for DirectPtxModule {}
+unsafe impl Sync for DirectPtxModule {}
+
 /// A cooperative kernel loaded via direct driver API.
 ///
 /// This provides true cooperative launch capability without relying on cudarc's
@@ -173,6 +181,15 @@ pub struct DirectCooperativeKernel {
     func_name: String,
 }
 
+// SAFETY: CUfunction handles are derived from CUmodule and share the same
+// thread-safety guarantees. The CUDA driver API serializes kernel launches
+// on the same stream. DirectCooperativeKernel holds the parent module via
+// _module, ensuring the module (and its functions) remain valid. All mutable
+// state (block_size, max_blocks) is initialized at construction and read-only
+// thereafter. Concurrent launches on different streams are safe per CUDA spec.
+unsafe impl Send for DirectCooperativeKernel {}
+unsafe impl Sync for DirectCooperativeKernel {}
+
 impl DirectCooperativeKernel {
     /// Load a cooperative kernel from PTX source.
     ///
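
Because `unsafe impl Send`/`Sync` is a promise the compiler cannot verify, one common way to keep the promise from silently regressing is a compile-time bound check. The sketch below assumes a hypothetical `FakeKernel` type standing in for a handle wrapper; `assert_send_sync` and `kernel_is_send_sync` are illustrative helpers and not functions from ringkernel-cuda.

```rust
use std::ffi::c_void;

/// Hypothetical stand-in for a kernel wrapper (not the crate's type).
#[allow(dead_code)]
pub struct FakeKernel {
    // Raw pointer field: without the impls below the type is !Send and !Sync.
    func: *mut c_void,
    // Set once at construction and only read afterwards.
    block_size: u32,
}

// SAFETY (for this sketch only): `func` is never dereferenced and no field
// is mutated after `new`, so shared references cannot race.
unsafe impl Send for FakeKernel {}
unsafe impl Sync for FakeKernel {}

impl FakeKernel {
    /// Creates a kernel wrapper with a null placeholder handle.
    pub fn new(block_size: u32) -> Self {
        FakeKernel { func: std::ptr::null_mut(), block_size }
    }

    /// Read-only accessor; no method takes `&mut self`.
    pub fn block_size(&self) -> u32 {
        self.block_size
    }
}

/// Compiles only when `T` is both `Send` and `Sync`; does nothing at runtime.
pub fn assert_send_sync<T: Send + Sync>() {}

/// Fails to compile (rather than fails at runtime) if the impls are removed.
pub fn kernel_is_send_sync() -> bool {
    assert_send_sync::<FakeKernel>();
    true
}
```

A check of this shape could sit in a unit test next to the wrapper types, so that deleting an impl or adding a non-thread-safe field without revisiting the SAFETY comment is caught at build time.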
