
Commit d5ebb75

Merge pull request #260 from ryanbreen/feat/pci-msi-networking
feat: interrupt-driven VirtIO net PCI via GICv2m MSI-X
2 parents 507e815 + 97beb25 commit d5ebb75

17 files changed

Lines changed: 2634 additions & 376 deletions

File tree

build.rs

Lines changed: 1 addition & 11 deletions
```diff
@@ -26,36 +26,26 @@ fn main() {
     boot_config.frame_buffer.minimum_framebuffer_height = Some(fb_height);
     disk_builder.set_boot_config(&boot_config);
 
-    println!("cargo:warning=Configured framebuffer: {}x{}", fb_width, fb_height);
-
     // specify output paths
     let out_dir = PathBuf::from(env::var("OUT_DIR").unwrap());
     let uefi_path = out_dir.join("breenix-uefi.img");
     let bios_path = out_dir.join("breenix-bios.img");
 
     // Only create the UEFI image by default. BIOS image can be enabled via env var.
-    println!("cargo:warning=Creating UEFI disk image at {}", uefi_path.display());
     disk_builder
         .create_uefi_image(&uefi_path)
         .expect("failed to create UEFI disk image");
 
     let build_bios = env::var("BREENIX_BUILD_BIOS").is_ok();
     if build_bios {
-        println!(
-            "cargo:warning=BREENIX_BUILD_BIOS set; creating BIOS disk image at {}",
-            bios_path.display()
-        );
         // New bootloader API removed BIOS builder; use UEFI image as placeholder to keep API surface stable.
         // If BIOS support is needed, switch to a branch that still exposes create_bios_image or vendor our own.
-        println!("cargo:warning=bootloader no longer provides create_bios_image; duplicating UEFI image for BIOS placeholder");
         disk_builder
             .create_uefi_image(&bios_path)
             .expect("failed to create BIOS placeholder image");
-    } else {
-        println!("cargo:warning=Skipping BIOS image creation (BREENIX_BUILD_BIOS not set)");
     }
 
     // pass the disk image paths via environment variables
     println!("cargo:rustc-env=UEFI_IMAGE={}", uefi_path.display());
     println!("cargo:rustc-env=BIOS_IMAGE={}", bios_path.display());
-}
+}
```
Lines changed: 267 additions & 0 deletions
# PCI MSI Interrupt-Driven Networking

## Problem

ARM64 network drivers (VirtIO net PCI on Parallels, e1000 on VMware) rely on
timer-based polling at 100 Hz (every 10 ms). This adds 5-10 ms of latency per
network round-trip, which compounds across the DNS, TCP handshake, and HTTP
response phases. On x86, the e1000 has a proper IRQ 11 handler that processes
packets immediately via softirq.

## Goal

Replace timer-based polling with interrupt-driven packet processing on ARM64,
achieving sub-millisecond packet delivery latency.

---
## Phase 1: VirtIO Net PCI MSI on Parallels (Priority: Immediate)

### Why This Is Easy

All infrastructure already exists and is proven working:

- **GIC driver** (`gic.rs`): `enable_spi()`, `disable_spi()`,
  `configure_spi_edge_triggered()`, `clear_spi_pending()` — all present
- **PCI driver** (`pci.rs`): `find_msi_capability()`, `configure_msi()`,
  `disable_intx()` — all present
- **GICv2m MSI** (`platform_config.rs`): `probe_gicv2m()`,
  `allocate_msi_spi()` — already used by the xHCI and GPU PCI drivers on Parallels
- **net_pci.rs** already has `handle_interrupt()` (line 552), which reads the ISR
  and raises the NetRx softirq — it's just never called from the interrupt path
### Files to Modify

#### 1. `kernel/src/drivers/virtio/net_pci.rs`

Add MSI setup following the exact pattern from `xhci.rs:setup_xhci_msi()`:

```rust
static NET_PCI_IRQ: AtomicU32 = AtomicU32::new(0);

pub fn get_irq() -> Option<u32> {
    let irq = NET_PCI_IRQ.load(Ordering::Relaxed);
    if irq != 0 { Some(irq) } else { None }
}

fn setup_net_pci_msi(pci_dev: &pci::Device) -> Option<u32> {
    // 1. Find the MSI capability (cap ID 0x05)
    let cap_offset = pci_dev.find_msi_capability()?;
    // 2. Probe GICv2m (already probed by xHCI; returns the cached value)
    let gicv2m_base = platform_config::gicv2m_base_phys()?;
    // 3. Allocate an SPI from the GICv2m pool
    let spi = platform_config::allocate_msi_spi()?;
    // 4. Program MSI: address = GICv2m doorbell, data = SPI number
    pci_dev.configure_msi(cap_offset, gicv2m_base + 0x40, spi);
    // 5. Disable INTx (MSI replaces it)
    pci_dev.disable_intx();
    // 6. Configure the GIC: edge-triggered, enable the SPI
    gic::configure_spi_edge_triggered(spi);
    gic::enable_spi(spi);
    Some(spi)
}
```
In `init()`, after device setup, call `setup_net_pci_msi()` and store the
result in `NET_PCI_IRQ`.

Update `handle_interrupt()` with the disable/clear/ack/enable SPI pattern
(matching the xHCI and GPU handlers):

```rust
pub fn handle_interrupt() {
    let irq = NET_PCI_IRQ.load(Ordering::Relaxed);
    if irq != 0 {
        gic::disable_spi(irq);
        gic::clear_spi_pending(irq);
    }
    // Read the ISR status register (existing code — auto-acks on read for legacy VirtIO)
    // Raise the NetRx softirq (existing code)
    if irq != 0 {
        gic::enable_spi(irq);
    }
}
```
#### 2. `kernel/src/arch_impl/aarch64/exception.rs`

Add a dispatch entry in the SPI match arm (32..=1019), alongside the existing
GPU PCI handler:

```rust
if let Some(net_pci_irq) = crate::drivers::virtio::net_pci::get_irq() {
    if irq_id == net_pci_irq {
        crate::drivers::virtio::net_pci::handle_interrupt();
    }
}
```
#### 3. `kernel/src/arch_impl/aarch64/timer_interrupt.rs`

Conditionalize polling — only poll when no MSI IRQ is configured:

```rust
if crate::drivers::virtio::net_pci::get_irq().is_none()
    && (net_pci::is_initialized() || e1000::is_initialized())
    && _count % 10 == 0
{
    raise_softirq(SoftirqType::NetRx);
}
```
### Verification

- DNS resolution should complete in <200 ms (was 4-5 seconds)
- HTTP fetch should complete in <2 seconds (was 10 seconds)
- `cat /proc/interrupts` or trace counters should show NIC interrupts firing

---
## Phase 2: E1000 MSI on VMware (Priority: Next)

VMware Fusion uses GICv3 with the ITS (Interrupt Translation Service), not
GICv2m. This is a different MSI delivery mechanism.
### Approach A: GICv3 ITS (Correct, Complex)

The ITS provides MSI translation for GICv3 systems:

1. **Discover the ITS**: Parse the ACPI MADT for an ITS entry, or scan the GIC
   redistributor space. The ITS is typically at a well-known address
   (e.g., 0x0801_0000 on VMware virt).

2. **Initialize the ITS**:
   - Allocate the command queue (4 KB aligned, mapped uncacheable)
   - Allocate the device table and collection table
   - Enable the ITS via GITS_CTLR

3. **Per-device setup**:
   - `MAPD` command: map the device ID to an interrupt translation table
   - `MAPC` command: map a collection to a target CPU (redistributor)
   - `MAPTI` command: map an event ID to an LPI (physical interrupt) and collection
   - `INV` command: invalidate the cached translation

4. **MSI configuration**:
   - MSI address = `GITS_TRANSLATER` physical address
   - MSI data = device-specific event ID
   - Program via `pci_dev.configure_msi(cap, its_translater, event_id)`

5. **IRQ handling**: LPIs are delivered via GICv3 ICC_IAR1_EL1, the same as
   SPIs. Dispatch by LPI number in exception.rs.

**Estimated effort**: 200-400 lines of new code for ITS initialization plus
per-device setup. The most complex part is the command queue protocol.
### Approach B: INTx via ACPI _PRT (Simpler, Limited)

Parse the ACPI DSDT for PCI interrupt routing:

1. **Parse the ACPI _PRT**: The PCI Routing Table maps (slot, pin) -> GIC SPI.
   Breenix already has basic ACPI parsing for the MADT/SPCR. Extend it to
   parse the DSDT for _PRT entries.

2. **Configure the SPI**: Once the SPI number is known from _PRT, configure it
   as level-triggered (INTx is level-triggered, not edge-triggered) and enable
   it in the GIC.

3. **Shared interrupt handling**: INTx lines may be shared between devices.
   The handler must check each device's ISR before claiming the interrupt.

**Estimated effort**: 100-200 lines for _PRT parsing plus a level-triggered
handler.
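The shared-interrupt step above can be sketched on the host. This is a minimal illustration, not Breenix code: the `IntxDevice` trait, `MockNic`, and `handle_shared_intx` are hypothetical names, standing in for real per-driver ISR reads. The point is the claiming protocol: on a shared level-triggered line, every device is polled, and each device with pending work is serviced before the handler returns.

```rust
// Sketch of shared-INTx dispatch. The trait and device names are
// hypothetical, not Breenix APIs.
trait IntxDevice {
    fn isr_pending(&self) -> bool; // would read the device's ISR register
    fn service(&mut self);         // would drain the RX ring, raise softirq, etc.
}

struct MockNic {
    pending: bool,
    serviced: u32,
}

impl IntxDevice for MockNic {
    fn isr_pending(&self) -> bool {
        self.pending
    }
    fn service(&mut self) {
        // Servicing clears the device's pending state (de-asserts the level).
        self.pending = false;
        self.serviced += 1;
    }
}

/// Returns true if any device claimed the interrupt; false on a real system
/// would indicate a spurious level-triggered interrupt.
fn handle_shared_intx(devices: &mut [&mut dyn IntxDevice]) -> bool {
    let mut claimed = false;
    for dev in devices.iter_mut() {
        if dev.isr_pending() {
            dev.service();
            claimed = true;
        }
    }
    claimed
}

fn main() {
    let mut nic = MockNic { pending: true, serviced: 0 };
    let mut other = MockNic { pending: false, serviced: 0 };
    assert!(handle_shared_intx(&mut [&mut nic, &mut other]));
    assert_eq!(nic.serviced, 1);
    assert_eq!(other.serviced, 0);
    // Once serviced, the line is quiet again: nothing left to claim.
    assert!(!handle_shared_intx(&mut [&mut nic, &mut other]));
    println!("shared INTx dispatch ok");
}
```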
### Approach C: VMware-Specific Probe (Pragmatic)

If VMware always maps e1000 INTx to a known SPI (discoverable from the device
tree or hardcoded for the vmware-aarch64 machine model), we could:

1. Read `interrupt_line` from PCI config space (currently 0xFF on ARM64)
2. Use VMware's device tree to find the actual SPI mapping
3. Hardcode the mapping as a platform quirk if it's stable

**Estimated effort**: 20-50 lines, but fragile.

### Recommendation

Start with Approach B (_PRT parsing), since the ACPI infrastructure partially
exists. Defer the ITS to Phase 3, when multiple PCI devices need independent
MSI vectors.

---
## Phase 3: Generic PCI Interrupt Framework (Priority: Future)

### Dynamic IRQ Dispatch Table

Replace the chain of `if let Some(irq)` checks in exception.rs with
registration-based dispatch:

```rust
static PCI_IRQ_HANDLERS: Mutex<[Option<(u32, fn())>; 16]> = Mutex::new([None; 16]);

pub fn register_pci_irq(spi: u32, handler: fn()) { ... }
```

This allows any PCI driver to register its own handler without modifying
exception.rs.
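The registration idea can be exercised as a self-contained host-side sketch. The table layout, fixed size of 16, and function names here are assumptions for illustration (the document's stub elides the bodies); a kernel version would use its own spinlock rather than `std::sync::Mutex`.

```rust
// Host-side sketch of registration-based PCI IRQ dispatch. Names and the
// fixed 16-slot table are assumptions, not Breenix APIs.
use std::sync::Mutex;

static PCI_IRQ_HANDLERS: Mutex<[Option<(u32, fn())>; 16]> = Mutex::new([None; 16]);
static PACKETS_SEEN: Mutex<u32> = Mutex::new(0);

/// Claim the first free slot; returns false if the table is full.
pub fn register_pci_irq(spi: u32, handler: fn()) -> bool {
    let mut table = PCI_IRQ_HANDLERS.lock().unwrap();
    for slot in table.iter_mut() {
        if slot.is_none() {
            *slot = Some((spi, handler));
            return true;
        }
    }
    false
}

/// Called from the SPI match arm; returns false for an unclaimed interrupt.
pub fn dispatch_pci_irq(spi: u32) -> bool {
    let table = PCI_IRQ_HANDLERS.lock().unwrap();
    for &entry in table.iter() {
        if let Some((registered_spi, handler)) = entry {
            if registered_spi == spi {
                handler();
                return true;
            }
        }
    }
    false
}

fn net_handler() {
    *PACKETS_SEEN.lock().unwrap() += 1;
}

fn main() {
    assert!(register_pci_irq(48, net_handler));
    assert!(dispatch_pci_irq(48));  // claimed by net_handler
    assert!(!dispatch_pci_irq(49)); // nothing registered for SPI 49
    assert_eq!(*PACKETS_SEEN.lock().unwrap(), 1);
    println!("dispatch ok");
}
```

A real implementation would also handle deregistration and duplicate SPIs; the sketch only shows the lookup that replaces the `if let` chain.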
### Full ITS Support

For GICv3 platforms (VMware, newer QEMU configs, real hardware):

- ITS command queue management
- LPI configuration tables (PROPBASER, PENDBASER)
- Per-device interrupt translation
- Multi-CPU interrupt routing via collections
### QEMU Virt INTx Mapping

The QEMU virt machine maps PCI INTx to fixed SPIs:

- INTA -> SPI 3 (GIC INTID 35)
- INTB -> SPI 4 (GIC INTID 36)
- INTC -> SPI 5 (GIC INTID 37)
- INTD -> SPI 6 (GIC INTID 38)

With swizzling: `actual_pin = (slot + pin - 1) % 4`

These are level-triggered and shared, requiring ISR checks per device.
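The fixed mapping plus swizzle above reduces to a few lines of arithmetic. The helper names below are illustrative; the constants (base SPI 3, SPIs starting at GIC INTID 32) come from the table above.

```rust
// Compute the QEMU virt SPI/INTID for a device's INTx line.
// `pin` is the PCI Interrupt Pin register value: 1 = INTA .. 4 = INTD.
fn qemu_virt_intx_spi(slot: u32, pin: u32) -> u32 {
    let swizzled = (slot + pin - 1) % 4; // swizzle from the table above
    3 + swizzled                         // INTA on slot 0 -> SPI 3
}

fn qemu_virt_intx_intid(slot: u32, pin: u32) -> u32 {
    32 + qemu_virt_intx_spi(slot, pin) // SPIs begin at GIC INTID 32
}

fn main() {
    assert_eq!(qemu_virt_intx_spi(0, 1), 3);    // slot 0, INTA -> SPI 3
    assert_eq!(qemu_virt_intx_intid(0, 1), 35); // GIC INTID 35
    assert_eq!(qemu_virt_intx_spi(1, 1), 4);    // swizzle: slot 1, INTA -> SPI 4
    assert_eq!(qemu_virt_intx_intid(0, 4), 38); // slot 0, INTD -> INTID 38
    println!("swizzle ok");
}
```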
---

## Architecture Reference

### Current Packet Receive Path (Polling)

```
Timer interrupt (1000 Hz)
  -> every 10th tick: raise_softirq(NetRx)
    -> net_rx_softirq_handler()
      -> process_rx()
        -> net_pci::receive() / e1000::receive()
          -> process_packet()
            -> udp::enqueue_packet() / tcp::handle_segment()
              -> wake blocked thread
```

Latency: 0-10 ms (mean 5 ms) per packet.

### Target Packet Receive Path (MSI)

```
NIC MSI interrupt -> GIC SPI
  -> exception.rs handle_irq()
    -> net_pci::handle_interrupt()
      -> read ISR (auto-ack)
      -> raise_softirq(NetRx)
        -> net_rx_softirq_handler()
          -> process_rx()
            -> ... (same as above)
```

Latency: <100 us per packet (GIC + softirq overhead).

### MSI Delivery on Parallels (GICv2m)

```
Device writes MSI data to the GICv2m doorbell address:
  addr = GICV2M_BASE + 0x40 (MSI_SETSPI_NS)
  data = allocated SPI number

GICv2m translates the write into a GIC SPI assertion.
The GIC delivers the SPI to the target CPU via ICC_IAR1_EL1.
```
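The doorbell computation above is just an (address, data) pair derived from the GICv2m base. A minimal sketch, assuming an example base address (the real base comes from `probe_gicv2m()` at runtime, and the function name here is illustrative):

```rust
// Sketch of the GICv2m MSI programming values from the diagram above.
const GICV2M_BASE: u64 = 0x0802_0000; // example value, not a real probe result
const MSI_SETSPI_NS: u64 = 0x40;      // doorbell register offset

/// Returns the (address, data) pair to program into the device's MSI
/// capability: a device write of `spi` to `addr` asserts that SPI in the GIC.
fn gicv2m_msi_target(spi: u32) -> (u64, u32) {
    (GICV2M_BASE + MSI_SETSPI_NS, spi)
}

fn main() {
    let (addr, data) = gicv2m_msi_target(48);
    assert_eq!(addr, 0x0802_0040);
    assert_eq!(data, 48);
    println!("msi target ok");
}
```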
