refactor: return const references from page accessor methods#826
refactor: return const references from page accessor methods#826polyominal wants to merge 1 commit intocmu-db:masterfrom
Conversation
There was a problem hiding this comment.
For the most part I think this is fine to add, with a few caveats that may or may not be relevant, as my C++ is rusty (which is intentionally a pun here as what I'll say is true for Rust).
In BusTub there are no checks to see if the keys and values are properly aligned in the leaf pages (and if that were true, then I think accessing the references would be UB). I think this is not a problem because the templates give the key and value arrays in the leaves proper padding implicitly (which I think is also a problem, but one for another day).
On top of that, if people start passing references to keys all over the place, and the buffer pool manager decides to evict that memory and replace it with something else, that could also be UB (unfortunately we do not have lifetimes for references in C++ so there is nothing preventing that).
Regardless, I think I agree with you that we should probably have these return data with lifetimes bound to the scope of the page they live on rather than the function caller.
Could you document this in the header files? And also I think this might need to change for the other indexes (extendible hash table) to be consistent.
|
Makes sense to have consistent interfaces with documentation changes. I'll look into this after I finish having fun with the projects.
I'm also bothered by how easy it is to create UAF in C++. A direct example in the BusTub context: char *hacked = std::invoke([&] {
WritePageGuard guard = bpm->WritePage(pid);
return guard.AsMut<char>();
});
snprintf(hacked, 3, "hi");In C++ it seems safety for such cases relies largely on user discipline and tooling, etc., which contrasts with rusty (the 🦀 sense) approaches. Now that I think of it, the iterator I'll continue working on the projects and see what feels right. |
Common sense:
is a strictly better interface than
(I thought it was obvious that these methods should return values whose lifetime is bound to the page object.)
FYI this is largely motivated by troubleshooting UB from writing something like
in the B+tree iterator implementation 🫠.