Commit 6c451d9

Authored by ryanbreen and claude
feat: MSI-X interrupt-driven GPU wake + zero-spin SUBMIT_3D (#263)
* feat: MSI-X interrupt-driven GPU wake + zero-spin SUBMIT_3D + frame pacing

  Three changes that together reduce BWM kernel CPU from ~20% to ~10%:

  1. MSI-X interrupt-driven GPU wake: GPU completion fires an MSI-X interrupt that immediately unblocks the compositor thread via the scheduler. Matches the Linux virtio-gpu driver pattern (wait_event_timeout). Zero spin cycles wasted waiting for SUBMIT_3D completion.
  2. Zero-spin path: When MSI-X is active, send_command_3desc blocks the thread immediately after notifying the device (no 5k spin warmup). The interrupt handler wakes the thread at precisely 3.4ms. The polling fallback (spin + timer) is preserved for non-MSI-X configurations.
  3. Frame pacing in compositor_wait: Enforces a 5ms minimum inter-frame interval to prevent the compositor from saturating the CPU when GPU wake is fast. Uses a plain timer block (not a compositor block) to avoid consuming dirty-wake signals meant for the main blocking section.

  Performance (Parallels, bounce demo running):
  - Kernel GPU busy: ~16% (was 20% with spin, 70% before blocking)
  - submit_cpu: ~700us (real VirGL work, zero spin waste)
  - sleep: ~2600us (72% of frame time truly idle)
  - FPS: ~280 (not artificially capped)
  - 23/23 bcheck tests passing

* fix: pass z-order from BWM to kernel for correct GPU compositing order

  Windows were drawn in slot order rather than z-order, causing window decorations to appear behind other windows' content. BWM now passes z_order (vec index) via set_window_position op=17, and the kernel sorts WindowCompositeInfo by z_order before the GPU draws back-to-front.

* fix: correct CPU tick accounting for blocked threads + BWM optimizations

  **Scheduler tick accounting (kernel/src/task/scheduler.rs):** When a thread blocked in a syscall (sleep, waitpid, compositor_wait, GPU command wait), the scheduler kept charging elapsed ticks as CPU time until the next context switch. This caused all blocking processes to report inflated CPU% (e.g., BWM showed 40% when actual CPU was 10%). Fix: charge ticks at block time in all block_current_* functions, and skip tick charging in schedule() for already-blocked threads. This affects every process in Breenix: any blocking syscall now correctly reports only active CPU time.

  **BWM optimizations (userspace/programs/src/bwm.rs):**
  - Replace per-pixel fill_rect with libgfx::shapes::fill_rect (3x faster: pre-clips once, then writes raw bytes with no per-pixel bounds checks)
  - Remove the wasted content-area fill in draw_window_frame (the GPU overwrites it)
  - Skip the redundant full redraw when clicking an already-focused/top window
  - Reduce now_realtime() syscalls from every frame to every 30th frame

  Result: BWM CPU 40% → 10% (accounting fix + reduced userspace work).

  Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
1 parent e3f2942 commit 6c451d9

6 files changed: 534 additions & 134 deletions

File tree

kernel/src/drivers/mod.rs

Lines changed: 4 additions & 0 deletions
```diff
@@ -122,6 +122,10 @@ pub fn init() -> usize {
             serial_println!("[drivers] VirGL 3D acceleration active");
             crate::graphics::set_compositor_backend(crate::graphics::CompositorBackend::VirGL);
             serial_println!("[drivers] Compositor backend: VirGL");
+            // Enable yielding during GPU command waits now that init is complete.
+            // During runtime, GPU poll loops yield to the scheduler instead of
+            // spinning, letting other tasks run during ~3.4ms GPU processing.
+            virtio::gpu_pci::enable_gpu_yield();
         }
         Err(e) => serial_println!("[drivers] VirGL init skipped: {}", e),
     }
```
