It has been a long-established principle that memory which is "inbounds" of a Rust allocation must not cause side-effects for reads and writes (other than the obvious ones) -- in particular, it must not abort execution. This is required because the optimizer will happily reorder accesses and move accesses to "dereferenceable" pointers out of conditionals or loops with no regard for other side-effects. Slightly more precisely, what this means is that when adding an allocation to the Rust AM (e.g. as part of the story for an asm block or FFI call), all memory inside that allocation has to be mapped, or else this operation is UB.
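To make the hoisting concern concrete, here is a minimal sketch. The function names are hypothetical, and `sum_if_hoisted` is a hand-written model of a transformation the optimizer is permitted to perform on `sum_if`, not actual compiler output. The point is that a `&u32` is treated as dereferenceable, so the load may be executed even on paths where the source never reads `*x`.

```rust
/// Adds `*x` to the total once for every `true` in `flags`.
/// As written, `*x` is only read when some flag is set.
fn sum_if(x: &u32, flags: &[bool]) -> u32 {
    let mut total = 0u32;
    for &f in flags {
        if f {
            total = total.wrapping_add(*x);
        }
    }
    total
}

/// Hand-written model of the same function after the load has been
/// hoisted out of the loop and the conditional. The read of `*x` now
/// happens unconditionally, even if `flags` is empty or all `false`.
fn sum_if_hoisted(x: &u32, flags: &[bool]) -> u32 {
    let v = *x; // speculated load: executed on every call
    let mut total = 0u32;
    for &f in flags {
        if f {
            total = total.wrapping_add(v);
        }
    }
    total
}
```

Both functions return the same result whenever `x` points to mapped memory. If the page behind `x` were unmapped, the second version would fault on calls where the first would not, which is why an allocation with unmapped pages cannot simply be treated as "fine as long as nobody touches them".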
However, last year we had multiple discussions about the possibility of weakening this requirement. The core observation is that the worst thing that can happen with such accesses is that the entire program gets aborted, which is "safe". So it seems plausible to say that when adding an allocation to the AM, it's okay to have some of its pages not mapped, but if you do that then your program may abort at any time without any reason whatsoever, i.e., even if the program ostensibly never accessed the offending memory, execution can still abort "randomly". This caveat is required to account for segfaults caused by reordered or speculated accesses.
This could also be helpful for file-backed `mmap`, which is always at risk of some other process truncating the file and thereby unmapping a bunch of the pages that the current process may consider to be inside a Rust AM allocation.
This recently came up again in the context of SGX, where apparently even the pages holding the program text can be unmapped at any time by the untrusted code, and somehow that's supposed to not be UB. Program text isn't a concept that exists inside the AM, so this would require slightly extending the idea, but the basic principle remains the same.
At the moment I am not aware of anything wrong with that approach, but we should certainly get it blessed by LLVM before officially telling Rust programmers that they can do this. Cc @nikic