Replaced cusparse wrappers with simple unique_ptrs and more RAII #1342

Open
Bubullzz wants to merge 8 commits into NVIDIA:main from Bubullzz:replace_wrappers_with_unique_ptrs

Conversation

@Bubullzz

@Bubullzz Bubullzz commented May 29, 2026

Contributor

The idea is to replace the big handmade wrappers for cuSPARSE objects with the more robust std::unique_ptr.
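
A minimal sketch of that pattern, assuming a C-style create/destroy API. The names `fake_descr`, `fake_create`, `fake_destroy`, `fake_descr_uptr`, `make_fake_descr`, and `holder_t` are hypothetical stand-ins for illustration and are not taken from cuSPARSE or from this PR. It shows how a stateless deleter lets `std::unique_ptr` own the handle, so the holding struct needs no hand-written destructor or move members.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-in for a C-style handle API (create/destroy pair);
// this is not the real cuSPARSE interface.
struct fake_descr { int size; };
inline int g_live_descriptors = 0;  // counts descriptors not yet destroyed

inline int fake_create(fake_descr** out, int size)
{
  *out = new fake_descr{size};
  ++g_live_descriptors;
  return 0;  // 0 means success in this sketch
}

inline int fake_destroy(fake_descr* d)
{
  delete d;
  --g_live_descriptors;
  return 0;
}

// Stateless deleter: unique_ptr invokes it only for non-null pointers.
struct fake_descr_deleter {
  void operator()(fake_descr* d) const noexcept { fake_destroy(d); }
};

using fake_descr_uptr = std::unique_ptr<fake_descr, fake_descr_deleter>;

// Factory that hands ownership to the caller; empty on failure.
inline fake_descr_uptr make_fake_descr(int size)
{
  fake_descr* raw = nullptr;
  if (fake_create(&raw, size) != 0) { return fake_descr_uptr{}; }
  return fake_descr_uptr{raw};
}

// Rule of zero: no user-written destructor, copy, or move members.
struct holder_t {
  fake_descr_uptr x;
  fake_descr_uptr y;
};
```

Because the members are move-only, `holder_t` is automatically movable and non-copyable, which rules out the double-destroy bugs a hand-written wrapper has to guard against explicitly.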

Trying to clean the code a bit as a side task while waiting for other jobs to finish. This is a subset of the full PR to get feedback from the team before going forward.

@Bubullzz Bubullzz requested a review from a team as a code owner May 29, 2026 15:24
@Bubullzz Bubullzz requested review from chris-maes and rg20 May 29, 2026 15:24
@copy-pr-bot

copy-pr-bot Bot commented May 29, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@Bubullzz Bubullzz added the "do not merge" label (Do not merge if this flag is set) May 29, 2026
@Bubullzz Bubullzz changed the title from "Replaced wrappers with simple unique_ptrs and more RAII" to "Replaced cusparse wrappers with simple unique_ptrs and more RAII" May 29, 2026
@coderabbitai

coderabbitai Bot commented May 29, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: b509301e-e411-4ea1-80a5-e2338bd4e632

📥 Commits

Reviewing files that changed from the base of the PR and between 036469d and 520218a.

📒 Files selected for processing (2)
  • cpp/src/barrier/barrier.cu
  • cpp/src/pdlp/cusparse_view.cu
💤 Files with no reviewable changes (1)
  • cpp/src/barrier/barrier.cu
🚧 Files skipped from review as they are similar to previous changes (1)
  • cpp/src/pdlp/cusparse_view.cu

Walkthrough

Refactors cuSPARSE descriptor lifetime: introduces owning RAII types and non-owning descriptor views with factory helpers, rebuilds constructors and factories to create RAII descriptors, and updates all cuSPARSE buffer-size/preprocess/compute call sites to pass descriptor pointers via .get().

Changes

cuSPARSE Descriptor RAII Migration

| Layer | File(s) | Summary |
|---|---|---|
| RAII descriptor types and factory helpers | `cpp/src/pdlp/cusparse_view.hpp` | Adds custom deleters, owning alias types (cusparse_sp_mat_uptr, cusparse_dn_vec_uptr, cusparse_dn_mat_uptr), non-owning view aliases (`cusparse_*_descr_view`), and factory declarations (make_csr, make_dnvec, make_dnmat, plus SpMVOp factories for CUDA ≥ 13.2). |
| Barrier module adoption of RAII descriptors | `cpp/src/barrier/cusparse_info.hpp`, `cpp/src/barrier/cusparse_view.{hpp,cu}`, `cpp/src/barrier/barrier.cu` | Changes cusparse_info_t and cusparse_view_t to store owning descriptors, removes manual destructor teardown, changes create_vector to return uptr, updates iteration_data_t to use cusparse_dn_vec_uptr, adjusts gpu_adat_multiply signature to use descriptor views, and rewires barrier cuSPARSE calls to use .get(). |
| PDLP cusparse_view constructors and SpMVOp plans | `cpp/src/pdlp/cusparse_view.{hpp,cu}` | Rebuilds PDLP cusparse_view_t constructors to instantiate matrices/vectors/tmp/batch descriptors via `make_*`, updates mixed-precision descriptor creation to use make_csr, updates buffer-size/preprocess calls and SpMVOp plan creation to accept descr_view and use .get(), and documents restart reconstruction semantics. |
| SpMM/SpGEMM and sparse-kernel updates | `cpp/src/barrier/sparse_matrix_kernels.cuh`, `cpp/src/pdlp/cusparse_view.cu` | Updates SpGEMM/SpMM helpers and sparse-matrix kernel paths to create RAII descriptors, wrap SpGEMMDescr_t in unique_ptr, and pass descriptors via .get() into cusparseSpGEMM and related APIs. |
| Optimal batch size & benchmarks | `cpp/src/pdlp/optimal_batch_size_handler/optimal_batch_size_handler.cu` | Changes benchmark context and evaluate_node signatures to accept descr_view; constructs dense x_descr/y_descr via make_dnmat and calls cusparseSpMM APIs with .get() handles. |
| PDHG/PDLP runtime call-site wiring | `cpp/src/pdlp/pdhg.cu`, `cpp/src/pdlp/pdlp.cu` | Updates compute_next_dual_solution, spmvop_At_y/spmvop_A_x, compute_At_y, compute_A_x, update_solution, PDLP batch-resize, compute_fixed_error, and initial step-size paths to pass .get() pointers for matrices, vectors, and plans. |
| Restart, termination, and infeasibility | `cpp/src/pdlp/restart_strategy/*`, `cpp/src/pdlp/termination_strategy/*` | Restart constructor recreates descriptors from saved sizes/pointers and rebinds tmp buffers; convergence, infeasibility, and termination routines updated to pass .get() descriptor pointers into SpMV/SpMM calls. |
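
The split between owning types and non-owning views summarized in the table can be sketched as follows. This is an illustration under assumptions: `fake_vec_descr`, `fake_vec_uptr`, `fake_vec_descr_view`, `make_fake_vec`, and `fake_dot` are hypothetical names modelled loosely on the aliases listed above, not the PR's actual declarations. As with cuSPARSE dense-vector descriptors, the descriptor here only references the caller's buffer and does not own it.

```cpp
#include <memory>

// Hypothetical descriptor standing in for an opaque library handle.
// It references the caller's buffer and does not own it.
struct fake_vec_descr {
  const double* data;
  int size;
};

struct fake_vec_deleter {
  void operator()(fake_vec_descr* d) const noexcept { delete d; }
};

// Owning type: destroys the descriptor when it goes out of scope.
using fake_vec_uptr = std::unique_ptr<fake_vec_descr, fake_vec_deleter>;
// Non-owning view: a plain pointer; the caller keeps the owner alive.
using fake_vec_descr_view = const fake_vec_descr*;

// Factory helper in the spirit of make_dnvec (names are assumptions).
inline fake_vec_uptr make_fake_vec(const double* data, int size)
{
  return fake_vec_uptr{new fake_vec_descr{data, size}};
}

// Library-style function that takes views and never takes ownership.
inline double fake_dot(fake_vec_descr_view x, fake_vec_descr_view y)
{
  double acc = 0.0;
  for (int i = 0; i < x->size; ++i) { acc += x->data[i] * y->data[i]; }
  return acc;
}
```

A call site then reads `fake_dot(x.get(), y.get())`, which mirrors the `.get()` rewiring the walkthrough describes: ownership stays with the `unique_ptr`, and the callee only borrows the raw handle for the duration of the call.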

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 5.56% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly describes the main change: replacing custom cusparse wrappers with std::unique_ptr-based RAII, which aligns with the changeset's core refactoring. |
| Description check | ✅ Passed | The description is related to the changeset, explaining the motivation to replace handmade wrappers with std::unique_ptr for improved robustness and code cleanliness. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cpp/src/pdlp/cusparse_view.cu (1)

240-242: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Missing has_value() check before dereferencing optional.

func at line 242 is dereferenced without checking if the dlsym lookup succeeded.

Proposed fix
 void cusparse_spmvop_run(cusparseHandle_t handle,
                          cusparseSpMVOpPlan_t plan,
                          const void* alpha,
                          const void* beta,
                          cusparse_dn_vec_descr_view vecX,
                          cusparse_dn_vec_descr_view vecY,
                          cusparse_dn_vec_descr_view vecZ,
                          cudaStream_t stream)
 {
   static const auto func = dynamic_load_runtime::function<cusparseSpMVOp_sig>("cusparseSpMVOp");
+  cuopt_expects(func.has_value(), "cusparseSpMVOp symbol not found at runtime");
   RAFT_CUSPARSE_TRY(cusparseSetStream(handle, stream));
   RAFT_CUSPARSE_TRY((*func)(handle, plan, alpha, beta, vecX, vecY, vecZ));
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/pdlp/cusparse_view.cu` around lines 240 - 242, The code dereferences
the optional dynamic_load_runtime::function<cusparseSpMVOp_sig> named func
without checking it; before calling (*func)(handle, plan, alpha, beta, vecX,
vecY, vecZ) add a check like if (!func.has_value()) and handle the failure (log
or return an error/throw) with a clear message including the symbol name
"cusparseSpMVOp"; keep the existing RAFT_CUSPARSE_TRY usage for actual cuSPARSE
calls and ensure the early error path prevents the dereference of func and
returns/propagates an appropriate error.
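
The check-before-dereference pattern requested in this comment can be illustrated in isolation. The sketch below assumes the loader returns a `std::optional` holding a function pointer; `lookup`, `real_add`, and `call_checked` are hypothetical names, and the project's own `dynamic_load_runtime::function` and `cuopt_expects` are deliberately not reproduced here.

```cpp
#include <optional>
#include <stdexcept>
#include <string>

using add_sig = int(int, int);

inline int real_add(int a, int b) { return a + b; }

// Hypothetical loader standing in for a dlsym-based lookup: it returns an
// empty optional when the symbol is not available.
inline std::optional<add_sig*> lookup(const std::string& name)
{
  if (name == "add") { return &real_add; }
  return std::nullopt;
}

// Checks the optional before dereferencing it, so a missing symbol becomes
// a clear exception instead of undefined behavior.
inline int call_checked(const std::string& name, int a, int b)
{
  const auto func = lookup(name);
  if (!func.has_value()) {
    throw std::runtime_error(name + " symbol not found at runtime");
  }
  return (*func)(a, b);
}
```

Dereferencing an empty `std::optional` with `operator*` is undefined behavior, which is why the explicit check matters even when an earlier capability query is expected to have succeeded.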
🧹 Nitpick comments (2)
cpp/src/pdlp/cusparse_view.hpp (1)

38-57: 💤 Low value

Consider _t suffix for deleter types per project naming conventions.

The coding guidelines specify types/structs should use snake_case_t with _t suffix (e.g., cusparse_sp_mat_deleter_t). However, the current naming follows common STL deleter patterns, so this is a stylistic choice.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/pdlp/cusparse_view.hpp` around lines 38 - 57, The structs
cusparse_sp_mat_deleter, cusparse_dn_vec_deleter, and cusparse_dn_mat_deleter do
not follow the project naming convention requiring a snake_case_t suffix; rename
them to cusparse_sp_mat_deleter_t, cusparse_dn_vec_deleter_t, and
cusparse_dn_mat_deleter_t respectively, update all uses/typedefs/usings in this
compilation unit and any headers that reference these types (e.g., unique_ptr
deleter specializations or variable declarations), and ensure the operator()
implementations remain unchanged and still call RAFT_CUSPARSE_TRY_NO_THROW on
cusparseDestroySpMat/cusparseDestroyDnVec/cusparseDestroyDnMat.
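
For context on why the deleters use a non-throwing check, here is a small sketch. It assumes a status-returning destroy function; `check_no_throw`, `fake_mat`, `fake_destroy_mat`, and `fake_mat_deleter_t` are hypothetical names (the last one using the `_t` suffix discussed above), and the real `RAFT_CUSPARSE_TRY_NO_THROW` macro is not reproduced.

```cpp
#include <cstdio>
#include <memory>

inline int g_logged_failures = 0;  // counts destroy failures seen so far

// Hypothetical non-throwing status check, suitable for destructors:
// it reports the failure and carries on instead of throwing.
inline void check_no_throw(int status) noexcept
{
  if (status != 0) {
    ++g_logged_failures;
    std::fprintf(stderr, "destroy failed with status %d\n", status);
  }
}

// Hypothetical handle whose destroy call can report an error code.
struct fake_mat { int status_on_destroy; };

inline int fake_destroy_mat(fake_mat* m)
{
  const int status = m->status_on_destroy;
  delete m;
  return status;
}

// Deleter named with the _t suffix; it must not let exceptions escape,
// because it runs from unique_ptr's destructor.
struct fake_mat_deleter_t {
  void operator()(fake_mat* m) const noexcept
  {
    check_no_throw(fake_destroy_mat(m));
  }
};

using fake_mat_uptr = std::unique_ptr<fake_mat, fake_mat_deleter_t>;
```

A throwing check inside the deleter would call `std::terminate` if it fired during stack unwinding, so logging the status is the safer choice for teardown paths.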
cpp/src/pdlp/cusparse_view.cu (1)

147-149: 💤 Low value

Inconsistent error-checking macro.

Line 147 uses CUSPARSE_CHECK while other cuSPARSE calls in this file use RAFT_CUSPARSE_TRY. Consider using RAFT_CUSPARSE_TRY for consistency.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/pdlp/cusparse_view.cu` around lines 147 - 149, Replace the
inconsistent CUSPARSE_CHECK call with the RAFT_CUSPARSE_TRY macro to match the
rest of the file: change the call to cusparseSetStream(...) so it is wrapped
with RAFT_CUSPARSE_TRY rather than CUSPARSE_CHECK, keeping the same arguments
and preserving the subsequent RAFT_CUSPARSE_TRY(cusparseSpMM_preprocess(...))
call; ensure you reference and use the RAFT_CUSPARSE_TRY macro for both
cusparseSetStream and cusparseSpMM_preprocess to maintain consistent error
handling.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cpp/src/pdlp/cusparse_view.cu`:
- Around line 213-218: The code dereferences the optional dynamic loader result
fn (dynamic_load_runtime::function<cusparseSpMVOp_createDescr_sig>) without
checking presence; update the factory that creates the cusparseSpMVOp_descr (the
block that calls (*fn)(...)) to first test fn.has_value() (or if(!fn) branch)
and handle the missing symbol by returning an empty cusparse_spmvop_descr_uptr
(or otherwise propagate a clear error) instead of dereferencing; ensure the
handling is applied where cusparseSpMVOp_createDescr is invoked so callers of
is_cusparse_runtime_spmvop_supported() are not assumed sufficient.
- Around line 221-228: The dynamic loader result `fn` in make_spmvop_plan is
dereferenced without checking it exists; update make_spmvop_plan to test
dynamic_load_runtime::function<...> fn for presence (e.g., if (!fn) throw or
return an error) before calling (*fn)(...), mirroring the fix used in
make_spmvop_descr so that cusparseSpMVOp_createPlan is only invoked when the
symbol lookup succeeded and RAFT_CUSPARSE_TRY is reached with a valid function
pointer.
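
One way to read the first suggestion (return an empty owning pointer when the symbol is missing) is sketched below. All names are hypothetical: `fake_plan`, `fake_plan_uptr`, `find_create`, and `make_fake_plan` only imitate the shape of the factories discussed, and the boolean parameter stands in for whatever runtime lookup the real code performs.

```cpp
#include <memory>
#include <optional>

struct fake_plan { int id; };

struct fake_plan_deleter {
  void operator()(fake_plan* p) const noexcept { delete p; }
};
using fake_plan_uptr = std::unique_ptr<fake_plan, fake_plan_deleter>;

using create_sig = int(fake_plan**);

inline int fake_create_plan(fake_plan** out)
{
  *out = new fake_plan{7};
  return 0;  // 0 means success in this sketch
}

// Hypothetical lookup result: empty when the runtime lacks the symbol.
inline std::optional<create_sig*> find_create(bool available)
{
  if (available) { return &fake_create_plan; }
  return std::nullopt;
}

// Factory that never dereferences an empty optional: a missing symbol or a
// failed create call both yield an empty owning pointer.
inline fake_plan_uptr make_fake_plan(bool symbol_available)
{
  const auto fn = find_create(symbol_available);
  if (!fn) { return fake_plan_uptr{}; }
  fake_plan* raw = nullptr;
  if ((*fn)(&raw) != 0) { return fake_plan_uptr{}; }
  return fake_plan_uptr{raw};
}
```

Returning an empty pointer pushes the decision to the caller, who must test the result before use; throwing instead, as the second suggestion allows, fails earlier and louder. Which one fits depends on whether a missing symbol is an expected configuration or a programming error.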

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 43105512-0a29-4bb5-b0f5-b5cd32da0e79

📥 Commits

Reviewing files that changed from the base of the PR and between ea7acf0 and e471907.

📒 Files selected for processing (5)
  • cpp/src/barrier/barrier.cu
  • cpp/src/barrier/cusparse_info.hpp
  • cpp/src/barrier/cusparse_view.cu
  • cpp/src/pdlp/cusparse_view.cu
  • cpp/src/pdlp/cusparse_view.hpp

Comment thread cpp/src/pdlp/cusparse_view.cu
Comment thread cpp/src/pdlp/cusparse_view.cu
@mlubin

mlubin commented May 29, 2026

Contributor

Trying to clean the code a bit as a side task while waiting for other jobs to finish.

I love the clean up effort! Leaving the review for the experts.

@Kh4ster Kh4ster requested review from Kh4ster and removed request for chris-maes and rg20 June 2, 2026 12:49
@github-actions


🔔 Hi @anandhkb, this pull request has had no activity for 7 days. Please update or let us know if it can be closed. Thank you!

If this is an "epic" issue, then please add the "epic" label to this issue.
If it is a PR and not ready for review, then please convert this to draft.
If you just want to switch off this notification, then use the "skip inactivity reminder" label.

@Bubullzz Bubullzz added the "improvement" label (Improves an existing functionality) and removed the "do not merge" label (Do not merge if this flag is set) Jun 11, 2026
@Bubullzz

Contributor Author

/ok to test 036469d

@Bubullzz

Contributor Author

/ok to test 520218a

@Bubullzz Bubullzz added the "non-breaking" label (Introduces a non-breaking change) Jun 11, 2026
Labels

improvement (Improves an existing functionality), non-breaking (Introduces a non-breaking change)