27 changes: 25 additions & 2 deletions src/hyperlight_wasm/src/sandbox/wasm_sandbox.rs
@@ -41,6 +41,7 @@ pub struct WasmSandbox {
     // Snapshot of state of an initial WasmSandbox (runtime loaded, but no guest module code loaded).
     // Used for LoadedWasmSandbox to be able restore state back to WasmSandbox
     snapshot: Option<Arc<Snapshot>>,
+    needs_restore: bool,
 }

 const MAPPED_BINARY_VA: u64 = 0x1_0000_0000u64;
@@ -56,6 +57,7 @@ impl WasmSandbox {
         Ok(WasmSandbox {
             inner: Some(inner),
             snapshot: Some(snapshot),
+            needs_restore: false,
         })
     }

@@ -64,24 +66,33 @@ impl WasmSandbox {
     /// the snapshot has already been created in that case.
     /// Expects a snapshot of the state where wasm runtime is loaded, but no guest module code is loaded.
     pub(super) fn new_from_loaded(
-        mut loaded: MultiUseSandbox,
+        loaded: MultiUseSandbox,
         snapshot: Arc<Snapshot>,
     ) -> Result<Self> {
-        loaded.restore(snapshot.clone())?;
         metrics::gauge!(METRIC_ACTIVE_WASM_SANDBOXES).increment(1);
         metrics::counter!(METRIC_TOTAL_WASM_SANDBOXES).increment(1);
         Ok(WasmSandbox {
             inner: Some(loaded),
             snapshot: Some(snapshot),
+            needs_restore: true,
         })
     }

+    fn restore_if_needed(&mut self) -> Result<()> {
+        if self.needs_restore {
+            self.inner.as_mut().ok_or(new_error!("WasmSandbox is none"))?.restore(self.snapshot.as_ref().ok_or(new_error!("Snapshot is none"))?.clone())?;
+            self.needs_restore = false;
+        }
+        Ok(())
+    }
Comment on lines +81 to +87
Copilot AI Mar 30, 2026

new_from_loaded() now defers restoring the runtime snapshot via needs_restore, but load_module_from_buffer() does not call restore_if_needed(). This can leave the sandbox in the post-module/poisoned state when loading from an in-memory buffer (e.g., after LoadedWasmSandbox::unload_module()), leading to incorrect module load behavior. Ensure all module-load entry points (including load_module_from_buffer) restore when needs_restore is set, or restore eagerly in new_from_loaded().

Copilot uses AI. Check for mistakes.
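
One way to make a deferred restore hard to miss is to route every load entry point through a single helper. The sketch below is not the PR's code: `Sandbox`, `State`, and `prepare_for_load` are hypothetical stand-ins that only model the `needs_restore` flag and make no Hyperlight calls. The `@@ -114,6 +136,7 @@` hunk further down appears to add the `restore_if_needed()` call to the buffer-mapping load path as well.

```rust
/// Hypothetical stand-in for the sandbox state; not a Hyperlight type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Clean, // runtime loaded, no guest module
    Dirty, // a module was loaded and has not been rolled back
}

struct Sandbox {
    state: State,
    needs_restore: bool,
    restores: u32, // how many times a restore actually ran
}

impl Sandbox {
    /// Models the deferred path: adopt a used sandbox without restoring yet.
    fn new_from_loaded() -> Self {
        Sandbox { state: State::Dirty, needs_restore: true, restores: 0 }
    }

    fn restore_if_needed(&mut self) -> Result<(), String> {
        if self.needs_restore {
            self.state = State::Clean;
            self.restores += 1;
            self.needs_restore = false;
        }
        Ok(())
    }

    /// Single choke point shared by every load entry point.
    fn prepare_for_load(&mut self) -> Result<(), String> {
        self.restore_if_needed()?;
        if self.state != State::Clean {
            return Err("sandbox is not in the runtime-only state".to_string());
        }
        Ok(())
    }

    fn load_module(&mut self) -> Result<(), String> {
        self.prepare_for_load()?;
        self.state = State::Dirty;
        Ok(())
    }

    fn load_module_from_buffer(&mut self) -> Result<(), String> {
        self.prepare_for_load()?;
        self.state = State::Dirty;
        Ok(())
    }
}
```

With a shared helper, a load path added later cannot skip the restore without also skipping the state check, so the omission shows up as an error rather than as a silently stale sandbox.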

     /// Load a Wasm module at the given path into the sandbox and return a `LoadedWasmSandbox`
     /// able to execute code in the loaded Wasm Module.
     ///
     /// Before you can call guest functions in the sandbox, you must call
     /// this function and use the returned value to call guest functions.
     pub fn load_module(mut self, file: impl AsRef<Path>) -> Result<LoadedWasmSandbox> {
+        self.restore_if_needed()?;
         let inner = self
             .inner
             .as_mut()
@@ -97,6 +108,17 @@ impl WasmSandbox {
         self.finalize_module_load()
     }

+    /// Load a Wasm module by restoring a Hyperlight snapshot taken
+    /// from a `LoadedWasmSandbox`.
+    pub fn load_from_snapshot(mut self, snapshot: Arc<Snapshot>) -> Result<LoadedWasmSandbox> {
+        let mut sb = self.inner.take().unwrap();
+        sb.restore(snapshot)?;
+        LoadedWasmSandbox::new(
+            sb,
+            self.snapshot.take().unwrap(),
Comment on lines +114 to +118
Copilot AI Mar 30, 2026

load_from_snapshot() uses unwrap() on self.inner and self.snapshot, which will panic if the sandbox is in an unexpected state. Prefer returning a structured Err (consistent with other methods in this file) so callers get a recoverable error rather than a process abort.

Suggested change
-        let mut sb = self.inner.take().unwrap();
-        sb.restore(snapshot)?;
-        LoadedWasmSandbox::new(
-            sb,
-            self.snapshot.take().unwrap(),
+        let mut sb = self.inner.take().ok_or(new_error!("WasmSandbox is none"))?;
+        sb.restore(snapshot)?;
+        let base_snapshot = self.snapshot.take().ok_or(new_error!("Snapshot is none"))?;
+        LoadedWasmSandbox::new(
+            sb,
+            base_snapshot,

+        )
Comment on lines +114 to +119
Copilot AI Mar 30, 2026

load_from_snapshot() constructs LoadedWasmSandbox directly, bypassing finalize_module_load() and therefore skipping METRIC_SANDBOX_LOADS increment. If this path is a “load”, it should update the same metrics as other load paths (or explicitly document/rename if it’s intended to be excluded).

Suggested change
-        let mut sb = self.inner.take().unwrap();
-        sb.restore(snapshot)?;
-        LoadedWasmSandbox::new(
-            sb,
-            self.snapshot.take().unwrap(),
-        )
+        let sb = self
+            .inner
+            .as_mut()
+            .ok_or_else(|| new_error!("WasmSandbox is None"))?;
+        sb.restore(snapshot)?;
+        self.finalize_module_load()

Comment on lines +114 to +119
Copilot AI Mar 30, 2026

New public API load_from_snapshot() is introduced but there are no tests covering its expected behavior (e.g., snapshot taken from a LoadedWasmSandbox, restore via WasmSandbox::load_from_snapshot, then successful guest calls / proper unload back to runtime snapshot). Since this module already has unit tests, adding coverage here would help prevent regressions in the snapshot-based CoW workflow.

Suggested change
-        let mut sb = self.inner.take().unwrap();
-        sb.restore(snapshot)?;
-        LoadedWasmSandbox::new(
-            sb,
-            self.snapshot.take().unwrap(),
-        )
+        let mut sb = self
+            .inner
+            .take()
+            .ok_or_else(|| new_error!("WasmSandbox is None"))?;
+        sb.restore(snapshot)?;
+        let runtime_snapshot = self
+            .snapshot
+            .take()
+            .ok_or_else(|| new_error!("Runtime snapshot is None"))?;
+        LoadedWasmSandbox::new(sb, runtime_snapshot)

+    }
+
     /// Load a Wasm module that is currently present in a buffer in
     /// host memory, by mapping the host memory directly into the
     /// sandbox.
@@ -114,6 +136,7 @@ impl WasmSandbox {
         base: *mut libc::c_void,
         len: usize,
     ) -> Result<LoadedWasmSandbox> {
+        self.restore_if_needed()?;
         let inner = self
             .inner
             .as_mut()
87 changes: 75 additions & 12 deletions src/hyperlight_wasm_runtime/src/platform.rs
@@ -172,32 +172,95 @@ pub extern "C" fn wasmtime_init_traps(handler: wasmtime_trap_handler_t) -> i32 {
     0
 }

-// The wasmtime_memory_image APIs are not yet supported.
+// Copy a VA range to a new VA. Old and new VA, and len, must be
+// page-aligned.
+fn copy_va_mapping(base: *const u8, len: usize, to_va: *mut u8, remap_original: bool) {
+    // TODO: make this more efficient by directly exposing the ability
+    // to traverse an entire VA range in
+    // hyperlight_guest_bin::paging::virt_to_phys, and coalescing
+    // continuous ranges there.
+    let base_u = base as u64;
+    let va_page_bases = (base_u..(base_u + len as u64)).step_by(vmem::PAGE_SIZE);
Copilot AI Mar 30, 2026

copy_va_mapping() relies on base, to_va, and len being page-aligned, but it does not enforce this at runtime. Since this is low-level mapping code (and the functions are extern "C"-reachable), add debug_assert!/checks for alignment (and potentially len != 0) to fail fast and avoid silent mis-mapping if the contract is violated.

Suggested change
-    let va_page_bases = (base_u..(base_u + len as u64)).step_by(vmem::PAGE_SIZE);
+    let to_va_u = to_va as u64;
+    let len_u = len as u64;
+    let page_size = vmem::PAGE_SIZE as u64;
+    // Enforce the documented contract: base, to_va, and len must be
+    // page-aligned, and len must be nonzero.
+    debug_assert!(len_u != 0, "copy_va_mapping: len must be nonzero");
+    debug_assert_eq!(
+        base_u % page_size,
+        0,
+        "copy_va_mapping: base must be page-aligned"
+    );
+    debug_assert_eq!(
+        to_va_u % page_size,
+        0,
+        "copy_va_mapping: to_va must be page-aligned"
+    );
+    debug_assert_eq!(
+        len_u % page_size,
+        0,
+        "copy_va_mapping: len must be a multiple of the page size"
+    );
+    if len_u == 0
+        || base_u % page_size != 0
+        || to_va_u % page_size != 0
+        || len_u % page_size != 0
+    {
+        panic!("copy_va_mapping called with non page-aligned arguments or zero length");
+    }
+    let va_page_bases = (base_u..(base_u + len_u)).step_by(vmem::PAGE_SIZE);

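
The suggestion above checks each condition twice, once with `debug_assert!` and once before the `panic!`. A single validation helper would cover the same contract in one place. The following is a minimal sketch under stated assumptions: `check_page_range` and `RangeError` are hypothetical names, the 4096-byte `PAGE_SIZE` stands in for `vmem::PAGE_SIZE`, and the overflow check is an addition that the review comment does not ask for.

```rust
/// Assumed page size for this sketch; the real code uses `vmem::PAGE_SIZE`.
const PAGE_SIZE: u64 = 4096;

#[derive(Debug, PartialEq)]
enum RangeError {
    ZeroLength,
    Unaligned,
    Overflow,
}

/// Validates the (base, to_va, len) contract and returns the page count.
fn check_page_range(base: u64, to_va: u64, len: u64) -> Result<u64, RangeError> {
    if len == 0 {
        return Err(RangeError::ZeroLength);
    }
    if base % PAGE_SIZE != 0 || to_va % PAGE_SIZE != 0 || len % PAGE_SIZE != 0 {
        return Err(RangeError::Unaligned);
    }
    // Reject ranges whose end address would wrap around.
    base.checked_add(len).ok_or(RangeError::Overflow)?;
    to_va.checked_add(len).ok_or(RangeError::Overflow)?;
    Ok(len / PAGE_SIZE)
}
```

A caller could turn the `Err` into a panic or a nonzero return code at the `extern "C"` boundary, depending on how strictly the contract should be enforced in release builds.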
+    let mappings = va_page_bases.flat_map(paging::virt_to_phys);
+    for mapping in mappings {
+        // TODO: Deduplicate with identical logic in hyperlight_host snapshot.
+        let (new_kind, was_writable) = match mapping.kind {
+            // Skip unmapped pages, since they will be unmapped in
+            // both the original and the new copy
+            vmem::MappingKind::Unmapped => continue,
+            vmem::MappingKind::Basic(bm) if bm.writable => (vmem::MappingKind::Cow(vmem::CowMapping {
+                readable: bm.readable,
+                executable: bm.executable,
+            }), true),
+            vmem::MappingKind::Basic(bm) => (vmem::MappingKind::Basic(vmem::BasicMapping {
+                readable: bm.readable,
+                writable: false,
+                executable: bm.executable,
+            }), false),
+            vmem::MappingKind::Cow(cm) => (vmem::MappingKind::Cow(cm), false),
+        };
+        if remap_original && was_writable {
+            // If necessary, remap the original page as Cow, instead
+            // of whatever it is now, to ensure that any more writes to
+            // that region do not change the image base.
+            //
+            // TODO: could the table traversal needed for this be fused
+            // with the table traversal that got the original mapping,
+            // above?
+            unsafe {
+                paging::map_region(
+                    mapping.phys_base,
+                    mapping.virt_base as *mut u8,
+                    vmem::PAGE_SIZE as u64,
+                    new_kind,
+                );
+            }
+        }
+        // map the same pages to the new VA
+        unsafe {
+            paging::map_region(
+                mapping.phys_base,
+                to_va.wrapping_add((mapping.virt_base - base_u) as usize),
+                vmem::PAGE_SIZE as u64,
+                new_kind,
+            );
+        }
+    }
Copilot AI Mar 30, 2026

After modifying page tables in copy_va_mapping() via paging::map_region, there is no barrier/TLB synchronization call (contrast with map_buffer() which calls paging::barrier::first_valid_same_ctx()). If required on this platform, missing the barrier could lead to stale translations when Wasmtime immediately touches the mapped pages; consider adding the appropriate barrier after the mapping loop.

Suggested change
-    }
+    }
+    // Ensure that all page table updates performed above are visible
+    // before returning, to avoid stale translations when the new
+    // mappings are immediately accessed.
+    unsafe {
+        paging::barrier::first_valid_same_ctx();
+    }

+}
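
The per-page decision inside `copy_va_mapping` can be read in isolation as a pure function from the old mapping kind to the kind shared by both views. The sketch below restates that `match` using simplified stand-in types: `MappingKind`, `BasicMapping`, and `CowMapping` here are local definitions whose fields are inferred from the diff, not the real `vmem` types, and `shared_kind` is a hypothetical name.

```rust
// Local stand-ins for the `vmem` types; fields are inferred from the diff.
#[derive(Debug, Clone, Copy, PartialEq)]
struct BasicMapping {
    readable: bool,
    writable: bool,
    executable: bool,
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct CowMapping {
    readable: bool,
    executable: bool,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum MappingKind {
    Unmapped,
    Basic(BasicMapping),
    Cow(CowMapping),
}

/// Returns the kind to use for the copy, plus whether the source page
/// was writable (and so must itself be remapped as copy-on-write).
/// `None` means the page is unmapped and is skipped.
fn shared_kind(kind: MappingKind) -> Option<(MappingKind, bool)> {
    match kind {
        MappingKind::Unmapped => None,
        // Writable pages become CoW so neither view can alter the other.
        MappingKind::Basic(bm) if bm.writable => Some((
            MappingKind::Cow(CowMapping {
                readable: bm.readable,
                executable: bm.executable,
            }),
            true,
        )),
        // Read-only pages can be shared as they are.
        MappingKind::Basic(bm) => {
            Some((MappingKind::Basic(BasicMapping { writable: false, ..bm }), false))
        }
        // Pages that are already CoW keep their kind.
        MappingKind::Cow(cm) => Some((MappingKind::Cow(cm), false)),
    }
}
```

Pulling the decision into a function like this would also give the "Deduplicate with identical logic in hyperlight_host snapshot" TODO a natural home, though whether the two call sites can share code is not something the diff settles.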

+// Create a copy-on-write memory image from some existing VA range.
+// `ptr` and `len` must be page-aligned (which is guaranteed by the
+// wasmtime-platform.h interface).
 #[no_mangle]
 pub extern "C" fn wasmtime_memory_image_new(
-    _ptr: *const u8,
-    _len: usize,
+    ptr: *const u8,
+    len: usize,
     ret: &mut *mut c_void,
 ) -> i32 {
-    *ret = core::ptr::null_mut();
+    // Choose an arbitrary VA, which we will use as the memory image
+    // identifier. We will construct the image by mapping a copy of
+    // the original VA range here, making the original copy CoW as we
+    // go.
+    let new_virt = FIRST_VADDR.fetch_add(0x100_0000_0000, Ordering::Relaxed) as *mut u8;
+    copy_va_mapping(ptr, len, new_virt, true);
+    *ret = new_virt as *mut c_void;
     0
 }
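
`wasmtime_memory_image_new` hands out image addresses by bumping an atomic counter in fixed strides of `0x100_0000_0000`. A minimal sketch of that reservation pattern follows. `reserve_image_va` is a hypothetical helper, and the length and overflow guards are assumptions added for illustration; the diff itself uses a plain `fetch_add` with no such checks.

```rust
use core::sync::atomic::{AtomicU64, Ordering};

/// Stride between image base addresses, taken from the diff.
const IMAGE_STRIDE: u64 = 0x100_0000_0000;

/// Reserves one stride-sized VA slot and returns its base address.
/// Returns `None` if `len` does not fit in a slot or the counter
/// would wrap around.
fn reserve_image_va(next: &AtomicU64, len: u64) -> Option<u64> {
    if len > IMAGE_STRIDE {
        return None;
    }
    // `fetch_update` retries on contention and yields the previous
    // value on success, which becomes this image's base address.
    next.fetch_update(Ordering::Relaxed, Ordering::Relaxed, |cur| {
        cur.checked_add(IMAGE_STRIDE)
    })
    .ok()
}
```

`Ordering::Relaxed` matches the diff and suffices for handing out unique values, since callers only need each returned base to be distinct and do not rely on the counter to order other memory accesses.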

 #[no_mangle]
 pub extern "C" fn wasmtime_memory_image_map_at(
-    _image: *mut c_void,
-    _addr: *mut u8,
-    _len: usize,
+    image: *mut c_void,
+    addr: *mut u8,
+    len: usize,
 ) -> i32 {
-    /* This should never be called because wasmtime_memory_image_new
-     * returns NULL */
-    panic!("wasmtime_memory_image_map_at");
+    copy_va_mapping(image as *mut u8, len, addr, false);
+    0
 }

 #[no_mangle]
 pub extern "C" fn wasmtime_memory_image_free(_image: *mut c_void) {
-    /* This should never be called because wasmtime_memory_image_new
-     * returns NULL */
+    /* This should never be called in practice, because we simple
Copilot AI Mar 30, 2026

Typo in comment: "because we simple" -> "because we simply".

Suggested change
-    /* This should never be called in practice, because we simple
+    /* This should never be called in practice, because we simply

+     * restore the snapshot rather than actually unload/destroy instances */
     panic!("wasmtime_memory_image_free");
Comment on lines +262 to 264
Copilot AI Mar 30, 2026

wasmtime_memory_image_new() now returns a non-null image handle, which makes it likely that Wasmtime will eventually call wasmtime_memory_image_free(). Currently wasmtime_memory_image_free() always panics, which can crash the guest in normal operation. Implement a real free/unmap for the image VA range (or redesign to return NULL/feature-gate) so dropping images is safe.

Suggested change
-    /* This should never be called in practice, because we simple
-     * restore the snapshot rather than actually unload/destroy instances */
-    panic!("wasmtime_memory_image_free");
+    /* Currently, we do not unmap or destroy memory images explicitly.
+     * This function is intentionally a no-op so that it is safe for
+     * Wasmtime to call during normal cleanup without crashing the guest. */

}
