System Environment:
OS: Arch Linux (UKI boot, strict lockdown/security policies enabled)
Kernel: 7.0.3-arch1-2 (PREEMPT_DYNAMIC)
CPU/iGPU: Intel Core i7-1260P (Alder Lake-P) | ID: 46a6
dGPU: Intel Arc A370M (DG2) | ID: 5693
Issue Description:
The xe driver encounters a fatal NULL pointer dereference (address: 00000000000005d8) when the power management subsystem attempts to put the display into a runtime suspend state.
The crash occurs in xe_display_flush_cleanup_work, which is called from xe_display_pm_runtime_suspend. The small, non-zero fault address (0x5d8) is consistent with an access to a structure member at offset 0x5d8 through a NULL base pointer, so the driver appears to enter runtime suspend without a valid pointer to one of its display data structures. The kernel's page-fault handler catches the invalid access and the system panics.
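To illustrate why a fault address of 0x5d8 suggests a member access through a NULL structure pointer, here is a minimal sketch of the address arithmetic. The structure layout and the names `HypotheticalDisplayState`, `earlier_members`, and `accessed_member` are assumptions made up for illustration; they are not the real xe display structures, and the actual member at that offset has not been identified.

```python
import ctypes

FAULT_ADDRESS = 0x5D8  # address reported in the BUG line


class HypotheticalDisplayState(ctypes.Structure):
    """Stand-in layout for illustration only; NOT the real xe structure."""

    _fields_ = [
        # Padding that represents every member located before the one accessed.
        ("earlier_members", ctypes.c_ubyte * FAULT_ADDRESS),
        # The member the faulting instruction tried to access (hypothetical).
        ("accessed_member", ctypes.c_void_p),
    ]


def fault_address(base_pointer: int, member_offset: int) -> int:
    """Address the CPU touches when evaluating base_pointer->member."""
    return base_pointer + member_offset


member_offset = HypotheticalDisplayState.accessed_member.offset

# With a NULL base pointer, the faulting address equals the member offset.
print(hex(fault_address(0, member_offset)))
```

If this reading is right, the next step is to find which structure in the display code has a member at offset 0x5d8 for this kernel build.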
Steps to Reproduce:
1. Boot the system with kernel parameters that bind both GPUs to xe and keep i915 off them:
   xe.force_probe=46a6,5693 i915.force_probe=!46a6,!5693
2. Allow the system to initialize the DRM devices (both devices probe and initialize successfully).
3. Wait for the PM subsystem to trigger a runtime suspend (e.g., by leaving the system idle shortly after boot).
4. The system triggers a kernel panic.
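The reproduction conditions above can be checked before the crash with a short script. This is a sketch, not part of the original report: it assumes the standard sysfs runtime PM attributes (`power/control` and `power/runtime_status`) and that bound devices appear as entries under `/sys/bus/pci/drivers/xe`. The helper names `force_probe_ids` and `runtime_pm_state` are hypothetical.

```python
from pathlib import Path


def force_probe_ids(cmdline: str, driver: str) -> list[str]:
    """Return the IDs listed in `<driver>.force_probe=` on a kernel command line."""
    prefix = f"{driver}.force_probe="
    for token in cmdline.split():
        if token.startswith(prefix):
            return token[len(prefix):].split(",")
    return []


def runtime_pm_state(device_dir: Path) -> dict[str, str]:
    """Read the runtime PM attributes of one PCI device directory in sysfs."""
    power = device_dir / "power"
    return {
        name: (power / name).read_text().strip()
        for name in ("control", "runtime_status")
    }


if __name__ == "__main__":
    cmdline_file = Path("/proc/cmdline")
    if cmdline_file.is_file():
        cmdline = cmdline_file.read_text()
        print("xe.force_probe:", force_probe_ids(cmdline, "xe"))
        print("i915.force_probe:", force_probe_ids(cmdline, "i915"))

    # Assumed location of the devices bound to xe; entries look like 0000:00:02.0.
    driver_dir = Path("/sys/bus/pci/drivers/xe")
    if driver_dir.is_dir():
        for device in sorted(driver_dir.glob("*:*:*.*")):
            print(device.name, runtime_pm_state(device))
```

A `control` value of `auto` means the device is allowed to runtime suspend, which is the condition under which the crash is reported. Watching `runtime_status` while the system idles should show how close to the transition the failure occurs.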
Key Technical Observations:
Targeted Failure: The crash is strictly isolated to the xe driver's display power management logic. Other PCI subsystems (such as Intel AX211 Wi-Fi and Bluetooth) survive the graphical crash and remain fully initialized and functional in the background.
Consistency: The behavior is identical on every kernel tested, from 6.17.x up to 7.0.3, which points to a logic error in the driver rather than a regression in a single release.
Hardware State: The system is otherwise completely stable. Thermal states are normal (no fan spin-up or throttling), and security modules (AppArmor, Lockdown) do not interfere with the driver initialization prior to the suspend trigger.
Relevant Call Trace:
BUG: kernel NULL pointer dereference, address: 00000000000005d8
...
xe_display_flush_cleanup_work+0x96/0x140 [xe]
xe_display_pm_runtime_suspend+0x4b/0x90 [xe]
xe_pm_runtime_suspend+0x147/0x300 [xe]
xe_pci_runtime_suspend+0x2a/0xe0 [xe]
pci_pm_runtime_suspend+0x78/0x210
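Each frame in the trace above has the form `symbol+0xOFFSET/0xSIZE [module]`. The following sketch splits frames into their parts so they can be compared across kernel versions or passed to a symbol resolver. The `Frame` type and `parse_frame` helper are hypothetical names introduced here, and the regular expression covers only the frame format shown in this report.

```python
import re
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    symbol: str             # function name
    offset: int             # byte offset of the instruction inside the function
    size: int               # total size of the function in bytes
    module: Optional[str]   # module name, or None for built-in code


FRAME_RE = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)"
    r"\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>[^\]]+)\])?"
)


def parse_frame(line: str) -> Optional[Frame]:
    """Parse one call-trace line; return None if it is not a frame."""
    match = FRAME_RE.search(line)
    if match is None:
        return None
    return Frame(
        symbol=match["symbol"],
        offset=int(match["offset"], 16),
        size=int(match["size"], 16),
        module=match["module"],
    )


if __name__ == "__main__":
    trace = [
        "xe_display_flush_cleanup_work+0x96/0x140 [xe]",
        "xe_display_pm_runtime_suspend+0x4b/0x90 [xe]",
        "pci_pm_runtime_suspend+0x78/0x210",
    ]
    for line in trace:
        print(parse_frame(line))
```

The top frame's `symbol+0xoffset/0xsize` string is the form accepted by the kernel tree's `scripts/faddr2line`. Given an xe module built with debug info (an assumption about the reporter's setup), that script should map offset 0x96 to a source line.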
Full dmesg log attached below.