[GSD-12416] Is DG1 supported with Linux xe KMD? #899

@ky-bd

Description

I see that there is both xe KMD support and DG1 support, but I can't get them working together. With the following debug settings, clinfo -l shows:

export NEOReadDebugKeys=1
export PrintDriverDiagnostics=5
export PrintDebugSettings=1
export PrintDebugMessages=1
export PrintXeLogs=1
export PrintBOCreateDestroyResult=1
# clinfo -l
Non-default value of debug variable: PrintDriverDiagnostics = 5
Non-default value of debug variable: PrintDebugSettings = 1
Non-default value of debug variable: PrintDebugMessages = 1
Non-default value of debug variable: PrintXeLogs = 1
Non-default value of debug variable: PrintBOCreateDestroyResult = 1
Shared System USM NOT allowed: KMD does not support
EXT_SET_PAT support is: disabled
INFO: System Info query failed!
WARNING: Failed to query memory info
WARNING: Failed to query engine info
WARNING: Topology query failed!
FATAL: Cannot query EU total parameter!
Platform #0: Intel(R) OpenCL
 `-- Device #0: AMD Ryzen 9 5900X 12-Core Processor

strace clinfo -l output (the ioctl operations):

...
openat(AT_FDCWD, "/dev/dri/by-path", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 5
fstat(5, {st_mode=S_IFDIR|0755, st_size=80, ...}) = 0
getdents64(5, 0x571b5acd2090 /* 4 entries */, 32768) = 144
getdents64(5, 0x571b5acd2090 /* 0 entries */, 32768) = 0
close(5)                                = 0
openat(AT_FDCWD, "/dev/dri/by-path/pci-0000:2f:00.0-render", O_RDWR|O_CLOEXEC) = 5
ioctl(5, DRM_IOCTL_VERSION, 0x7fff929d60a0) = 0
ioctl(5, DRM_IOCTL_VERSION, 0x7fff929d6080) = 0
ioctl(5, DRM_IOCTL_XE_DEVICE_QUERY, 0x7fff929d6140) = 0
ioctl(5, DRM_IOCTL_XE_DEVICE_QUERY, 0x7fff929d6140) = 0
write(2, "Shared System USM NOT allowed: K"..., 52Shared System USM NOT allowed: KMD does not support
) = 52
ioctl(5, DRM_IOCTL_VERSION, 0x7fff929d5cf0) = 0
ioctl(5, DRM_IOCTL_I915_GEM_CREATE_EXT, 0x7fff929d5bf0) = -1 EINVAL
write(1, "EXT_SET_PAT support is: disabled"..., 33EXT_SET_PAT support is: disabled
) = 33
ioctl(5, DRM_IOCTL_I915_REG_READ, 0x7fff929d5bf0) = -1 EINVAL
ioctl(5, DRM_IOCTL_I915_REG_READ, 0x7fff929d5bf0) = -1 EINVAL
readlink("/proc/self/exe", "/usr/bin/clinfo", 511) = 15
ioctl(5, DRM_IOCTL_I915_QUERY, 0x7fff929d5ca0) = -1 EINVAL
write(1, "INFO: System Info query failed!\n", 32INFO: System Info query failed!
) = 32
ioctl(5, DRM_IOCTL_I915_QUERY, 0x7fff929d5b80) = -1 EINVAL
write(2, "WARNING: Failed to query memory "..., 37WARNING: Failed to query memory info
) = 37
ioctl(5, DRM_IOCTL_I915_QUERY, 0x7fff929d5a20) = -1 EINVAL
write(2, "WARNING: Failed to query engine "..., 37WARNING: Failed to query engine info
) = 37
ioctl(5, DRM_IOCTL_I915_QUERY, 0x7fff929d5b30) = -1 EINVAL
write(2, "WARNING: Topology query failed!\n", 32WARNING: Topology query failed!
) = 32
ioctl(5, DRM_IOCTL_I915_GETPARAM, 0x7fff929d5bf0) = -1 EINVAL
write(2, "FATAL: Cannot query EU total par"..., 40FATAL: Cannot query EU total parameter!
) = 40
close(5)                                = 0
munmap(0x720be1000000, 35286328)        = 0
munmap(0x720be3643000, 864456)          = 0
close(4)                                = 0
...
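For anyone reproducing this, it may help to confirm which kernel driver is actually bound to the render node before looking at the runtime. The trace above shows DRM_IOCTL_VERSION succeeding on the fd; the following is a minimal sketch of that query using the kernel uapi header. The function name queryDrmDriverName is hypothetical, and this is not taken from the runtime's own Drm::getDrmVersion implementation.

```cpp
#include <drm/drm.h>   // struct drm_version, DRM_IOCTL_VERSION (kernel uapi header)
#include <sys/ioctl.h>
#include <string>

// Hypothetical helper: returns the kernel driver name bound to an open DRM
// file descriptor (for example "xe" or "i915"), or an empty string on failure.
std::string queryDrmDriverName(int fd) {
    char name[32] = {};                   // zero-filled, so the result stays NUL-terminated
    drm_version version = {};
    version.name = name;
    version.name_len = sizeof(name) - 1;  // leave room for the terminator
    if (ioctl(fd, DRM_IOCTL_VERSION, &version) != 0) {
        return "";                        // invalid fd or not a DRM device
    }
    return std::string(name);
}
```

To use it, open the render node from the trace (/dev/dri/by-path/pci-0000:2f:00.0-render) with O_RDWR | O_CLOEXEC and pass the resulting fd. With the environment described in this issue the expected name would be "xe", but I have not run this sketch on that machine.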

By looking into the code, I found that the runtime queries the device with IoctlHelperXe if xe KMD is found:

bool Drm::queryDeviceIdAndRevision() {
    auto drmVersion = Drm::getDrmVersion(getFileDescriptor());
    if ("xe" == drmVersion) {
        this->setPerContextVMRequired(false);
        return IoctlHelperXe::queryDeviceIdAndRevision(*this);
    }
    return IoctlHelperI915::queryDeviceIdAndRevision(*this);
}

... but sets up a product-specific ioctl helper for DG1:

void Drm::setupIoctlHelper(const PRODUCT_FAMILY productFamily) {
    if (!this->ioctlHelper) {
        auto drmVersion = Drm::getDrmVersion(getFileDescriptor());
        auto productSpecificIoctlHelperCreator = ioctlHelperFactory[productFamily];
        if (productSpecificIoctlHelperCreator && !debugManager.flags.IgnoreProductSpecificIoctlHelper.get()) {
            this->ioctlHelper = productSpecificIoctlHelperCreator.value()(*this);
        } else if ("xe" == drmVersion) {
            this->ioctlHelper = IoctlHelperXe::create(*this);
        } else {
            std::string prelimVersion = "";
            getPrelimVersion(prelimVersion);
            this->ioctlHelper = IoctlHelper::getI915Helper(productFamily, prelimVersion, *this);
        }
        this->ioctlHelper->initialize();
    }
}

... which is an IoctlHelperImpl<IGFX_DG1>:

struct EnableProductIoctlHelperDg1 {
    EnableProductIoctlHelperDg1() {
        ioctlHelperFactory[IGFX_DG1] = IoctlHelperImpl<IGFX_DG1>::get;
    }
};
static EnableProductIoctlHelperDg1 enableIoctlHelperDg1;

... and it inherits from IoctlHelperUpstream, which in turn inherits from IoctlHelperI915, rather than from IoctlHelperXe.

template <PRODUCT_FAMILY gfxProduct>
class IoctlHelperImpl : public IoctlHelperUpstream {

class IoctlHelperUpstream : public IoctlHelperI915 {

So the device is queried with i915 ioctl commands, which fail with EINVAL under the xe KMD, and clinfo fails to probe the device.
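To make the ordering problem concrete, here is a small standalone model of the selection logic quoted above. All types and names in it (Product, HelperKind, Factory, selectHelperAsQuoted, selectHelperKmdFirst) are hypothetical stand-ins, not the runtime's real classes. The second function is only one possible reordering for discussion, not a proposed or confirmed fix, and it says nothing about whether DG1 would then work end to end on xe.

```cpp
#include <map>
#include <string>

// Hypothetical stand-ins; only the order of the checks is modelled here.
enum class Product { Dg1, Other };
enum class HelperKind { ProductSpecificI915, Xe, GenericI915 };

// Models ioctlHelperFactory: products that register a product-specific helper.
using Factory = std::map<Product, HelperKind>;

// Same order as the quoted setupIoctlHelper: a registered product-specific
// helper wins before the "xe" check is ever reached.
HelperKind selectHelperAsQuoted(const Factory &factory, Product product,
                                const std::string &drmVersion,
                                bool ignoreProductSpecific) {
    auto it = factory.find(product);
    if (it != factory.end() && !ignoreProductSpecific) {
        return it->second;
    }
    if (drmVersion == "xe") {
        return HelperKind::Xe;
    }
    return HelperKind::GenericI915;
}

// Hypothetical reordering: consult the KMD name first, so an xe-bound
// device never receives an i915-derived helper.
HelperKind selectHelperKmdFirst(const Factory &factory, Product product,
                                const std::string &drmVersion,
                                bool ignoreProductSpecific) {
    if (drmVersion == "xe") {
        return HelperKind::Xe;
    }
    auto it = factory.find(product);
    if (it != factory.end() && !ignoreProductSpecific) {
        return it->second;
    }
    return HelperKind::GenericI915;
}
```

With a factory that registers only Product::Dg1, the first function returns ProductSpecificI915 for a DG1 on "xe", which matches the failure above, while the second returns Xe. The model also suggests that setting the IgnoreProductSpecificIoctlHelper debug flag seen in the quoted code might route DG1 to the xe helper; I have not verified that on real hardware.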

clinfo works fine with the i915 KMD:

# clinfo -l
Non-default value of debug variable: PrintDriverDiagnostics = 5
Non-default value of debug variable: PrintDebugSettings = 1
Non-default value of debug variable: PrintDebugMessages = 1
Non-default value of debug variable: PrintXeLogs = 1
Non-default value of debug variable: PrintBOCreateDestroyResult = 1
EXT_SET_PAT support is: disabled
INFO: System Info query failed!
WARNING: Failed to request OCL Turbo Boost
Created new BO with GEM_USERPTR, handle: BO-1
NEO_CACHE_PERSISTENT is enabled. Cache is located in: /root/.cache/neo_compiler_cache

Performing GEM_CREATE_EXT with { size: 4096, memory class: 0, memory instance: 0 }
GEM_CREATE_EXT with EXT_MEMORY_REGIONS has returned: 0 BO-2 with size: 4096
Performing GEM_CREATE_EXT with { size: 4096, memory class: 0, memory instance: 0 }
GEM_CREATE_EXT with EXT_MEMORY_REGIONS has returned: 0 BO-3 with size: 4096
Performing GEM_CREATE_EXT with { size: 4096, memory class: 0, memory instance: 0 }
GEM_CREATE_EXT with EXT_MEMORY_REGIONS has returned: 0 BO-4 with size: 4096
Performing GEM_CREATE_EXT with { size: 4096, memory class: 0, memory instance: 0 }
GEM_CREATE_EXT with EXT_MEMORY_REGIONS has returned: 0 BO-5 with size: 4096
Performing GEM_CREATE_EXT with { size: 4096, memory class: 0, memory instance: 0 }
GEM_CREATE_EXT with EXT_MEMORY_REGIONS has returned: 0 BO-6 with size: 4096
computeUnitsUsedForScratch: 768
hwInfo: {96, 672}: (16, 1, 6)
Platform #0: Intel(R) OpenCL Graphics
 `-- Device #0: Intel(R) Iris(R) Xe MAX Graphics
Platform #1: Intel(R) OpenCL
 `-- Device #0: AMD Ryzen 9 5900X 12-Core Processor
Calling gem close on handle: BO-2
Calling gem close on handle: BO-3
Calling gem close on handle: BO-4
Calling gem close on handle: BO-5
Calling gem close on handle: BO-6
Calling gem close on handle: BO-1

Environment

  • CPU: AMD Ryzen 9 5900X
  • GPU: Intel Corporation DG1 [Iris Xe MAX Graphics] [8086:4905] (rev 01)
  • MB: MSI MAG X570 TOMAHAWK WIFI (MS-7C84) (UEFI boot, resizable BAR and above 4G decoding enabled, CSM disabled)
  • OS: Ubuntu 24.04.4 LTS
  • Kernel: 6.17.0-14-generic
  • Kernel cmdline: xe.force_probe=4905 acpi_enforce_resources=lax pci=nommconf pcie_acs_override=downstream pcie_aspm=off
  • Driver: xe (i915 blacklisted in modprobe.d)
  • OpenCL loader: Khronos OpenCL ICD Loader 3.0.7
  • OpenCL runtime (ICD): intel-opencl-icd 26.05.37020.3-1~24.04~ppa2
  • Level Zero: libze-intel-gpu1 26.05.37020.3-1~24.04~ppa2, libze1 1.27.0-1~24.04~ppa1
  • GMM: libigdgmm12 22.9.0-1~24.04~ppa1
  • Intel GSC: 0.9.5-1~24.04~ppa2
  • linux-firmware: 20240318.git3b128b60-0ubuntu2.25

Metadata

Labels

  • OS: Linux. Issue specific to Linux distributions (Ubuntu, Fedora, RHEL, etc.)
  • Status: Backlog. Confirmed issue; pending scheduling or queued awaiting resources
