ARM64 network drivers (VirtIO net PCI on Parallels, e1000 on VMware) rely on timer-based polling at 100Hz (every 10ms). This adds 5-10ms latency per network round-trip, which compounds across DNS, TCP handshake, and HTTP response phases. On x86, the e1000 has a proper IRQ 11 handler that processes packets immediately via softirq.
Replace timer-based polling with interrupt-driven packet processing on ARM64, achieving sub-millisecond packet delivery latency.
All infrastructure already exists and is proven working:
- GIC driver (`gic.rs`): `enable_spi()`, `disable_spi()`, `configure_spi_edge_triggered()`, `clear_spi_pending()` — all present
- PCI driver (`pci.rs`): `find_msi_capability()`, `configure_msi()`, `disable_intx()` — all present
- GICv2m MSI (`platform_config.rs`): `probe_gicv2m()`, `allocate_msi_spi()` — already used by the xHCI and GPU PCI drivers on Parallels
- `net_pci.rs` already has `handle_interrupt()` (line 552), which reads the ISR and raises the NetRx softirq — it's just never called from the interrupt path
Add MSI setup following the exact pattern of `setup_xhci_msi()` in `xhci.rs`:
```rust
static NET_PCI_IRQ: AtomicU32 = AtomicU32::new(0);

pub fn get_irq() -> Option<u32> {
    let irq = NET_PCI_IRQ.load(Ordering::Relaxed);
    if irq != 0 { Some(irq) } else { None }
}

fn setup_net_pci_msi(pci_dev: &pci::Device) -> Option<u32> {
    // 1. Find MSI capability (cap ID 0x05)
    let cap_offset = pci_dev.find_msi_capability()?;
    // 2. Probe GICv2m (already probed by xHCI, returns cached value)
    let gicv2m_base = platform_config::gicv2m_base_phys()?;
    // 3. Allocate SPI from GICv2m pool
    let spi = platform_config::allocate_msi_spi()?;
    // 4. Program MSI: address = GICv2m doorbell, data = SPI number
    pci_dev.configure_msi(cap_offset, gicv2m_base + 0x40, spi);
    // 5. Disable INTx (MSI replaces it)
    pci_dev.disable_intx();
    // 6. Configure GIC: edge-triggered, enable SPI
    gic::configure_spi_edge_triggered(spi);
    gic::enable_spi(spi);
    Some(spi)
}
```

In `init()`, after device setup, call `setup_net_pci_msi()` and store the result in `NET_PCI_IRQ`.
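That `init()` wiring can be sketched as a standalone mock. The static and `get_irq()` are repeated here so the snippet compiles on its own, and the MSI setup is stubbed with a hypothetical fixed SPI, since no real PCI device is available outside the kernel:

```rust
use std::sync::atomic::{AtomicU32, Ordering};

static NET_PCI_IRQ: AtomicU32 = AtomicU32::new(0);

pub fn get_irq() -> Option<u32> {
    let irq = NET_PCI_IRQ.load(Ordering::Relaxed);
    if irq != 0 { Some(irq) } else { None }
}

// Stand-in for setup_net_pci_msi(): in the kernel this programs the MSI
// capability and returns the allocated SPI; here it returns a fixed value.
fn setup_net_pci_msi_mock() -> Option<u32> {
    Some(48) // hypothetical SPI from the GICv2m pool
}

pub fn init() {
    // ... device setup (queues, buffers) would happen here ...
    if let Some(spi) = setup_net_pci_msi_mock() {
        NET_PCI_IRQ.store(spi, Ordering::Relaxed);
    }
    // If MSI setup fails, NET_PCI_IRQ stays 0 and the timer-poll
    // fallback remains active.
}
```

The zero sentinel doubles as the "no MSI configured" signal consumed by the polling conditional later on.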
Update `handle_interrupt()` with the disable/clear/ack/enable SPI pattern (matching the xHCI and GPU handlers):
```rust
pub fn handle_interrupt() {
    let irq = NET_PCI_IRQ.load(Ordering::Relaxed);
    if irq != 0 {
        gic::disable_spi(irq);
        gic::clear_spi_pending(irq);
    }
    // Read ISR status register (existing code — auto-acks on read for legacy VirtIO)
    // Raise NetRx softirq (existing code)
    if irq != 0 {
        gic::enable_spi(irq);
    }
}
```

Add a dispatch entry in the SPI match arm (32..=1019), alongside the existing GPU PCI handler:
```rust
if let Some(net_pci_irq) = crate::drivers::virtio::net_pci::get_irq() {
    if irq_id == net_pci_irq {
        crate::drivers::virtio::net_pci::handle_interrupt();
    }
}
```

Conditionalize polling — only poll when no MSI IRQ is configured:
```rust
if crate::drivers::virtio::net_pci::get_irq().is_none()
    && (net_pci::is_initialized() || e1000::is_initialized())
    && _count % 10 == 0
{
    raise_softirq(SoftirqType::NetRx);
}
```

- DNS resolution should complete in <200ms (was 4-5 seconds)
- HTTP fetch should complete in <2 seconds (was 10 seconds)
- `cat /proc/interrupts` or trace counters should show NIC interrupts firing
VMware Fusion uses GICv3 with ITS (Interrupt Translation Service), not GICv2m. This is a different MSI delivery mechanism.
The ITS provides MSI translation for GICv3 systems:
- Discover ITS: Parse the ACPI MADT for the ITS entry, or scan GIC redistributor space. The ITS is typically at a well-known address (e.g., 0x0801_0000 on VMware virt).
- Initialize ITS:
  - Allocate the command queue (4KB aligned, mapped uncacheable)
  - Allocate the device table and collection table
  - Enable the ITS via GITS_CTLR
- Per-device setup:
  - `MAPD` command: map device ID to interrupt table
  - `MAPTI` command: map event ID to LPI (physical interrupt)
  - `MAPI` command: map interrupt to collection (target CPU)
  - `INV` command: invalidate cached translation
- MSI configuration:
  - MSI address = `GITS_TRANSLATER` physical address
  - MSI data = device-specific event ID
  - Program via `pci_dev.configure_msi(cap, its_translater, event_id)`
- IRQ handling: LPIs are delivered via GICv3 `ICC_IAR1_EL1`, same as SPIs. Dispatch by LPI number in exception.rs.
Estimated effort: 200-400 lines of new code for ITS initialization + per-device setup. Most complex part is the command queue protocol.
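As a concrete illustration of the command queue protocol, here is a sketch of encoding a `MAPTI` command. ITS commands are four 64-bit words appended to the command queue; the field layout here follows the GICv3 architecture specification (IHI 0069), but the exact bit positions should be verified against the spec before use:

```rust
/// Encode a MAPTI command: maps (DeviceID, EventID) to a physical LPI
/// targeting collection `icid`. Word 3 is reserved (zero).
fn encode_mapti(device_id: u32, event_id: u32, lpi: u32, icid: u16) -> [u64; 4] {
    let mut cmd = [0u64; 4];
    cmd[0] = 0x0A | ((device_id as u64) << 32);        // opcode 0x0A + DeviceID[63:32]
    cmd[1] = (event_id as u64) | ((lpi as u64) << 32); // EventID[31:0] + pINTID[63:32]
    cmd[2] = icid as u64;                              // ICID[15:0]: target collection
    cmd
}
```

In the real driver, the encoded words would be written to the queue slot at `GITS_CWRITER`, the pointer advanced, and completion awaited via `GITS_CREADR` (or a `SYNC` command).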
Parse the ACPI DSDT for PCI interrupt routing:
- Parse ACPI _PRT: The PCI Routing Table maps (slot, pin) -> GIC SPI. Breenix already has basic ACPI parsing for MADT/SPCR; extend it to parse the DSDT for _PRT entries.
- Configure SPI: Once the SPI number is known from _PRT, configure it as level-triggered (INTx is level, not edge) and enable it in the GIC.
- Shared interrupt handling: INTx lines may be shared between devices. The handler must check each device's ISR before claiming the interrupt.
Estimated effort: 100-200 lines for _PRT parsing + level-triggered handler.
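The shared-line requirement above can be sketched as a handler that polls every registered device on the line. The per-device checks are modeled as plain function pointers here; in the kernel they would be each driver's ISR-read routine, which services the device and returns whether it had actually asserted the line:

```rust
/// Check-and-claim function: reads a device's interrupt status register,
/// services it if pending, and returns true if it asserted the line.
type IsrCheck = fn() -> bool;

/// Handle one level-triggered shared INTx interrupt. Returns how many
/// devices claimed it; 0 means the interrupt was spurious.
fn handle_shared_intx(devices: &[(&str, IsrCheck)]) -> usize {
    let mut claimed = 0;
    for (_name, check) in devices {
        // Every device on the line must be checked: more than one may
        // be pending, and a level line stays asserted until all ack.
        if check() {
            claimed += 1;
        }
    }
    claimed
}
```

This is why INTx is slower than MSI even once it works: every interrupt costs one ISR read per device sharing the line.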
If VMware always maps e1000 INTx to a known SPI (discoverable from the device tree or hardcoded for the vmware-aarch64 machine model), we could:
- Read `interrupt_line` from PCI config space (currently 0xFF on ARM64)
- Use VMware's DT to find the actual SPI mapping
- Hardcode the mapping as a platform quirk if it's stable
Estimated effort: 20-50 lines, but fragile.
Start with Approach B (_PRT parsing) since ACPI infrastructure partially exists. Defer ITS to Phase 3 when multiple PCI devices need independent MSI vectors.
Replace the chain of `if let Some(irq)` checks in exception.rs with registration-based dispatch:

```rust
static PCI_IRQ_HANDLERS: Mutex<[(u32, fn()); 16]>;

pub fn register_pci_irq(spi: u32, handler: fn()) { ... }
```

This allows any PCI driver to register its own handler without modifying exception.rs.
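A self-contained sketch of that table follows, using a fixed-size array with SPI 0 marking a free slot (SPI 0 is never a valid PCI interrupt, since SPIs start at INTID 32). std's `Mutex` stands in for the kernel's own spinlock:

```rust
use std::sync::Mutex;

const MAX_PCI_IRQS: usize = 16;

fn noop() {}

// (spi, handler) pairs; spi == 0 means the slot is free.
static PCI_IRQ_HANDLERS: Mutex<[(u32, fn()); MAX_PCI_IRQS]> =
    Mutex::new([(0, noop as fn()); MAX_PCI_IRQS]);

/// Register a handler for a GIC SPI; returns false if the table is full.
pub fn register_pci_irq(spi: u32, handler: fn()) -> bool {
    let mut table = PCI_IRQ_HANDLERS.lock().unwrap();
    for slot in table.iter_mut() {
        if slot.0 == 0 {
            *slot = (spi, handler);
            return true;
        }
    }
    false
}

/// Called from the SPI match arm (32..=1019); returns true if dispatched.
pub fn dispatch_pci_irq(irq_id: u32) -> bool {
    let table = PCI_IRQ_HANDLERS.lock().unwrap();
    for &(spi, handler) in table.iter() {
        if spi == irq_id {
            handler();
            return true;
        }
    }
    false
}
```

One design note: the handler runs while the lock is held here for simplicity; the kernel version should copy the function pointer out and drop the lock first, so a handler that re-enters registration cannot deadlock.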
For GICv3 platforms (VMware, newer QEMU configs, real hardware):
- ITS command queue management
- LPI configuration tables (PROPBASER, PENDBASER)
- Per-device interrupt translation
- Multi-CPU interrupt routing via collections
QEMU virt machine maps PCI INTx to fixed SPIs:
- INTA -> SPI 3 (GIC INTID 35)
- INTB -> SPI 4 (GIC INTID 36)
- INTC -> SPI 5 (GIC INTID 37)
- INTD -> SPI 6 (GIC INTID 38)
With swizzling: actual_pin = (slot + pin - 1) % 4
These are level-triggered and shared, requiring ISR checks per device.
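The swizzle above can be checked with a small helper. `SPI_INTA_BASE` reflects the mapping listed for the QEMU virt machine (INTA on slot 0 lands on SPI 3, GIC INTID 35); `pin` is the 1-based value from PCI config space (INTA = 1):

```rust
const SPI_INTA_BASE: u32 = 3; // QEMU virt: INTA -> SPI 3 (GIC INTID 35)

/// Which of the four shared SPIs a device in `slot` using 1-based `pin`
/// is routed to, per the standard PCI INTx swizzle.
fn intx_spi(slot: u32, pin: u32) -> u32 {
    SPI_INTA_BASE + (slot + pin - 1) % 4
}

/// The corresponding GIC INTID (SPIs occupy INTIDs 32..=1019).
fn intx_intid(slot: u32, pin: u32) -> u32 {
    32 + intx_spi(slot, pin)
}
```

Note how slot 1's INTA swizzles onto SPI 4, the same line as slot 0's INTB: that sharing is exactly why the handler must check each device's ISR.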
```
Timer interrupt (1000Hz)
-> every 10th tick: raise_softirq(NetRx)
-> net_rx_softirq_handler()
-> process_rx()
-> net_pci::receive() / e1000::receive()
-> process_packet()
-> udp::enqueue_packet() / tcp::handle_segment()
-> wake blocked thread
```
Latency: 0-10ms (mean 5ms) per packet.
```
NIC MSI interrupt -> GIC SPI
-> exception.rs handle_irq()
-> net_pci::handle_interrupt()
-> read ISR (auto-ack)
-> raise_softirq(NetRx)
-> net_rx_softirq_handler()
-> process_rx()
-> ... (same as above)
```
Latency: <100us per packet (GIC + softirq overhead).
Device writes MSI data to GICv2m doorbell address:
```
addr = GICV2M_BASE + 0x40 (MSI_SETSPI_NS)
data = allocated SPI number
```
GICv2m translates write to GIC SPI assertion.
GIC delivers SPI to target CPU via ICC_IAR1_EL1.
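The address/data pair the driver programs into the device's MSI capability follows directly from this. A tiny helper makes the relationship explicit (the base address in the example is hypothetical; 0x40 is the `MSI_SETSPI_NS` offset described above):

```rust
const MSI_SETSPI_NS: u64 = 0x40; // GICv2m doorbell register offset

/// Compute the (address, data) pair to program into a device's MSI
/// capability: the device writes `spi` to the doorbell, and the GICv2m
/// turns that write into an assertion of that SPI.
fn gicv2m_msi_target(gicv2m_base: u64, spi: u32) -> (u64, u32) {
    (gicv2m_base + MSI_SETSPI_NS, spi)
}
```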