Improve the documentation around layout gurantees given by Box #155597
weiznich wants to merge 1 commit into
rustbot has assigned @Mark-Simulacrum.
//! convert between `Box<A>` and `Box<B>` using unsafe code if its sound to do the same conversion
//! between `A` and `B`.
//!
//! For zero-sized values, the `Box` pointer has to be non-null and sufficiently aligned. The
The existing text has two paragraphs covering the two cases "non-zero-sized" and "zero-sized". Please don't break that structure by adding an entirely unrelated paragraph in the middle.
I put it a bit further down. If that's still a bad location please suggest a better one.
//! obtained from [`Box::<T>::into_raw`] may be deallocated using the [`Global`] allocator with
//! [`Layout::for_value(&*value)`].
//!
//! It can be assumed that the layout of `Box<A>` is compatible to the layout of `Box<B>` if
I'm a bit concerned that we have never defined what "compatible layout" actually means. I don't think we are using it in existing docs, are we?
What about going with "The layout is the same for Box<A> and Box<B> if it's the same for A and B" instead?
That uses a different word but doesn't fix the problem that we haven't actually defined the terms you are using here.
Fundamentally the problem is that we have basically no documentation of the sort you want to add, and adding the first pieces of such documentation needs a lot of infrastructure (like term definitions) to be established. Cc @traviscross @ehuss
Also it occurs to me that it seems odd to make this guarantee for Box but not for raw pointers or references.
This is using the wording out of the nomicon:
When transmuting between different compound types, you have to make sure they are laid out the same way! If layouts differ, the wrong fields are going to get filled with the wrong data, which will make you unhappy and can also be Undefined Behavior (see above).
So how do you know if the layouts are the same? For repr(C) types and repr(transparent) types, layout is precisely defined. But for your run-of-the-mill repr(Rust), it is not. Even different instances of the same generic type can have wildly different layout. Vec<i32> and Vec<u32> might have their fields in the same order, or they might not. The details of what exactly is and is not guaranteed for data layout are still being worked out over at the UCG WG.
For me that reads like there is some existing usage of "the same layout" defined there, so I'm rather surprised to hear that "same layout" is not defined yet.
That's sad, given that this is linked from the docs of transmute. Maybe that link should be removed from there if it's not something you are allowed to rely on.
Otherwise, given that all of the above sounds like it will require quite a lot of discussion, and I do not have the capacity to drive that discussion forward, I wondered if there is another way to get the guarantee I'm looking for. For example, what about just adding a note to transmute itself stating that transmuting from Box<A> to Box<B> is fine if transmuting A to B is fine? Maybe even extend it to references?
That would get the necessary guarantee into the docs without needing to introduce a lot of new vocabulary.
Someone will have to do that work some day if we want to get proper unsafe docs. Rust is developed by volunteers stepping up to tasks like that, the teams usually don't have the capacity to do such work themselves. But I can understand that it may be a bit daunting and that you also may not have the capacity.
Probably the easiest thing to say is something that compares transmuting Box with other operations.
- if a pointer-to-pointer cast from `*mut T` to `*mut U` is allowed, and furthermore `T` and `U` have the same unsized tail, then the transmute between those two types behaves the same as the cast. (That's a raw pointer guarantee. Not sure where the best place for this is.)
- and then for the same pairs of types `T`, `U` we can say that a transmute of `Box` is also allowed and that it acts like invoking `into_raw`, then casting the raw pointer, and then calling `from_raw`.
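To make the proposed equivalence concrete, here is a minimal sketch of the two routes side by side. The `Meters` newtype and the function names `via_cast` and `via_transmute` are hypothetical and only serve as illustration; the soundness of the transmute route is exactly the guarantee this thread is trying to document, so treat that part as an assumption rather than settled fact.

```rust
use std::mem;

// Hypothetical newtype used only for illustration. `repr(transparent)`
// gives it the same size, alignment, and validity as `u32`.
#[repr(transparent)]
struct Meters(u32);

// The explicit sequence from the list above: into_raw, pointer cast, from_raw.
fn via_cast(b: Box<u32>) -> Box<Meters> {
    let raw: *mut u32 = Box::into_raw(b);
    let cast: *mut Meters = raw.cast::<Meters>();
    // SAFETY: the pointer came from `Box::into_raw`, and `Meters` has the
    // same size and alignment as `u32`, so the allocation layout matches.
    unsafe { Box::from_raw(cast) }
}

// The direct transmute that is proposed to behave like `via_cast`.
fn via_transmute(b: Box<u32>) -> Box<Meters> {
    // SAFETY (assumed): relies on the guarantee discussed in this thread,
    // namely that this transmute acts like the sequence in `via_cast`.
    unsafe { mem::transmute::<Box<u32>, Box<Meters>>(b) }
}

fn main() {
    let a = via_cast(Box::new(7));
    let b = via_transmute(Box::new(7));
    assert_eq!(a.0, 7);
    assert_eq!(b.0, 7);
}
```

Phrasing the guarantee as "equivalent to this sequence of operations" sidesteps having to define what "same layout" means for the two `Box` types.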
I looked into more documentation and it seems like the reference does use the combination "the same layout" as well when it describes #[repr(transparent)]. It also defines type layout, although it doesn't go into detail about what "same layout" exactly means. On the other hand I wonder how relevant it is to define these details, as the relevant fact is that the layouts are "the same" and that on its own can be stated as fact for certain combinations (like the #[repr(transparent)] case, or types with an otherwise known layout). For the Box<A> -> Box<B> case this might be sufficient, as it's very similar to #[repr(transparent)] or other pointer casts?
Also the linked page states that the type layout "is its size, alignment, and the relative offsets of its fields" (and the discriminant for enums). So it shouldn't be too hard to clarify there that same layout means that all of this is guaranteed to match. It's then the responsibility of each representation (or each specific type like Box) to actually document that these guarantees are given (or not).
And just to be sure as you noted that the Nomicon is not normative: I assume that the reference is normative?
Yes, the reference is normative. :)
The problem with "same layout" is that on its own it still doesn't say what happens on a transmute. Somehow a value of the old type changes to a value of the new type. How? You can't say "it's the same value" as the values don't even have the same type so that's a meaningless statement. For repr(transparent) it's kind of obvious what to do, but for most other cases here that would take much more care to describe.
That's why I proposed focusing on saying "a transmute is equivalent to the following sequence of operations".
Thanks for the clarifications. I finally found some time to go over the wording again. As suggested, the added documentation now explicitly states that this is equivalent to the pointer cast. I also added an additional sentence to highlight that the requirements of transmute also need to be satisfied, which mostly boils down to matching size, alignment and validity of values. These requirements are directly taken from the transmute docs, so I assume that they are normative?
Hopefully that now better fits the existing documentation?
b17e07b to ad7e61c
//! It can be assumed that the layout of `Box<A>` is the same as the layout of `Box<B>` if
//! the layout of `A` is the same as the layout of `B` otherwise. It is therefore sound to
//! convert between `Box<A>` and `Box<B>` using unsafe code if its sound to do the same conversion
//! between `A` and `B`.
//!
I think this is too strong. Notably, e.g. transmute between A and B being OK does not mean that transmute between Box<A> and Box<B> is OK: if the two types have different alignment requirements, transmute is OK, but by-pointer transmute is not.
We already guarantee below that Box<T> is ABI compatible with *const T for T: Sized, maybe we can add a sentence to that wording about Box<T> being #[repr(transparent)] for NonNull<T> which should give the right sense of what guarantees users can expect of the type?
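The alignment concern above can be illustrated with a small sketch. The `Aligned` type and the `same_size_and_align` helper are hypothetical names introduced here; the check is only a necessary condition for a by-pointer conversion, not a complete soundness test.

```rust
use std::mem::{align_of, size_of, transmute};

// Hypothetical type: same size as `[u8; 4]`, but with alignment 4.
#[repr(C, align(4))]
struct Aligned([u8; 4]);

// Hypothetical helper: checks the size and alignment requirements that a
// `Box<A>` to `Box<B>` conversion would need (validity is not checked).
fn same_size_and_align<A, B>() -> bool {
    size_of::<A>() == size_of::<B>() && align_of::<A>() == align_of::<B>()
}

fn main() {
    // By-value transmute is fine: the sizes match and every bit pattern
    // of `[u8; 4]` is a valid `Aligned`.
    let a: Aligned = unsafe { transmute::<[u8; 4], Aligned>([1, 2, 3, 4]) };
    assert_eq!(a.0, [1, 2, 3, 4]);

    // By-pointer conversion is not: a `Box<[u8; 4]>` allocation is only
    // guaranteed to be 1-aligned, and dropping it as `Box<Aligned>` would
    // deallocate with a different `Layout`.
    assert_eq!(align_of::<[u8; 4]>(), 1);
    assert_eq!(align_of::<Aligned>(), 4);
    assert!(!same_size_and_align::<[u8; 4], Aligned>());

    // For `T: Sized`, `Box<T>` is documented to be a single pointer.
    assert_eq!(size_of::<Box<u32>>(), size_of::<*const u32>());
}
```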
ad7e61c to 11797d0
11797d0 to 50953b7
50953b7 to 7d6d433
This change extends the memory layout section in the documentation of `Box` to explicitly state that it is sound to convert between `Box<A>` and `Box<B>` as long as a cast between `*mut A` and `*mut B` is valid and the general requirements of `transmute` are satisfied. See https://rust-lang.zulipchat.com/#narrow/channel/136281-t-opsem/topic/Is.20transmuting.20between.20.60Box.3CA.3E.60.20and.20.60Box.3CB.3E.60.20UB.3F/with/585350243 for the relevant discussion.
7d6d433 to b21dc98