fix: TensorGrid blind-data lookup uses index 0 instead of loop counter#652
Conversation
LoadNanovdb.cpp::loadTensorBlindData iterated over channels but called grid->getBlindData(0) in the body, so every channel was reading from channel 0. Reporter caught this while reviewing merged PR openvdb#641. Same pattern at four sites (L210, L236, L366, L370). Fixes openvdb#642 Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
|
|
GCC's -Werror=class-memaccess flags memcpy into nanovdb::GridBlindMetaData because the struct has non-trivial copy semantics (mName fixed-buffer). Use copy-assignment instead; same effect, no warning. Caught by fVDB Build (Conda / pip CUDA 12.8 / pip CUDA 13.0) on this PR. Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
|
Pushed 67f3604 to fix the build failure. GCC's |
The 3D-tensor input to expectRoundTripWithLeadingGridNameBlindData produced a grid layout that did not match the helper's mBlindMetadataCount == 1 assumption, so the TORCH_CHECK fired before the test body could assert anything. Construct the TensorGrid fixture for the 3D case directly via a new makeTensorGridBlindDataHandle helper that builds a single-blind-data TensorGrid buffer in place. The shape-metadata sibling test still goes through fvdb::to_nanovdb. Both tests now exercise the index-0 vs loop-counter path that the production fix in 1f8b60a targets. Local rebuild was blocked by sandbox DNS during CPM.cmake download, so the verification relies on CI for the gtest run. Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
|
Addressed in ae78cc1. The 3D-tensor input to Local rebuild was blocked in my sandbox (DNS during CPM.cmake download), so the gtest run is on CI. If The CUDA 13.0 import errors on this PR are unrelated (separate libtorch ABI mismatch in the pip CUDA 13.0 build, same issue as #653). |
swahtz
left a comment
There was a problem hiding this comment.
Thanks for the contribution. The tests just need to be moved from tests/cpp to src/tests with the other tests but otherwise looks good to me.
|
|
||
| # Configure an example test | ||
| ConfigureTest(ExampleTest "ExampleTest.cpp") | ||
| ConfigureTest(LoadNanovdbTest "../../tests/cpp/LoadNanovdbTest.cpp") |
There was a problem hiding this comment.
This should be in the 'unit tests' area below. And please move the test source file to live alongside the other ones.
|
Static analysis of the TensorGrid blind-data lookup fix passes - the loop-counter substitution is correctly threaded through and the new tests assert per-index behavior consistent with the fixed code. Couldn't reproduce the CUDA test failures locally (pytest collection blocked because the |
The tests will be fixed when you merge |
Summary
LoadNanovdb.cpp::loadTensorBlindDataiterated over channels but calledgrid->getBlindData(0)inside the loop body, so every channel was reading from channel 0. Same bug at four sites (L210, L236, L366, L370). Replace(0)with(i)at each call site.Why this matters
Reporter caught this while reviewing the recently-merged PR #641 and dropped the exact line numbers. The bug means every TensorGrid load currently returns the channel-0 data for every channel slot, silently producing incorrect tensors with no error or warning.
Changes
src/fvdb/detail/io/LoadNanovdb.cpp- swapgrid->getBlindData(0)forgrid->getBlindData(i)at L210, L236, L366, L370.Testing
Existing TensorGrid load tests cover this; they pass with the fix on HEAD 7e8213e.
Fixes #642