This layer demonstrates how a layer can emulate functionality, in this case the cl_khr_command_buffer extension and the related cl_khr_command_buffer_mutable_dispatch extension.
It works by intercepting calls to clGetExtensionFunctionAddressForPlatform to query function pointers for the cl_khr_command_buffer and cl_khr_command_buffer_mutable_dispatch extension APIs.
If the query succeeds, the layer does nothing and returns the queried function pointer as-is.
If the query fails, the layer returns its own function pointer, which records the contents of the command buffer for later playback.
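The fallback described above can be sketched roughly as follows. This is a simplified model that deliberately avoids the OpenCL headers so that it compiles on its own: `GenericFn`, `EmulatedEntry`, `kEmulated`, `resolveExtensionFunction`, `queryNext`, and the two `emulated...` functions are hypothetical names chosen for illustration. The actual layer works with the typed function pointers from the Khronos headers, and the real query also takes a `cl_platform_id`.

```cpp
#include <cstring>

// Stand-in for an extension entry point (assumption: the real layer uses
// the typed OpenCL function pointers rather than a generic one).
using GenericFn = void (*)();

// Hypothetical emulated entry points; the bodies are placeholders.
void emulatedCreateCommandBuffer() {}
void emulatedFinalizeCommandBuffer() {}

// Maps an extension function name to its emulated implementation.
struct EmulatedEntry {
    const char* name;
    GenericFn   fn;
};

static const EmulatedEntry kEmulated[] = {
    {"clCreateCommandBufferKHR", emulatedCreateCommandBuffer},
    {"clFinalizeCommandBufferKHR", emulatedFinalizeCommandBuffer},
};

// queryNext models forwarding the query to the next layer or the driver.
GenericFn resolveExtensionFunction(GenericFn (*queryNext)(const char*),
                                   const char* name)
{
    // Prefer the native implementation when the query succeeds.
    if (GenericFn native = queryNext(name)) {
        return native;
    }
    // Otherwise fall back to the emulated function, if one exists.
    for (const EmulatedEntry& entry : kEmulated) {
        if (std::strcmp(entry.name, name) == 0) {
            return entry.fn;
        }
    }
    return nullptr;  // Not a function this sketch knows how to emulate.
}
```

Keeping the native path first means the emulation only engages when the underlying implementation does not already provide the extension.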
This command buffer emulation layer currently implements v0.9.8 of the cl_khr_command_buffer extension and v0.9.5 of the cl_khr_command_buffer_mutable_dispatch extension.
The functionality in this emulation layer is sufficient to run the command buffer samples in this repository.
Please note that the emulated command buffers are intended to be functional, but unlike a native implementation, they may not provide any performance benefit over equivalent code that does not use command buffers.
Because this layer calls clCloneKernel when recording a command buffer, it requires an OpenCL 2.1 or newer device.
If an older device is detected then the layer will not advertise support for the cl_khr_command_buffer or cl_khr_command_buffer_mutable_dispatch extensions.
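One way to perform such a version check is to parse the string returned for `CL_DEVICE_VERSION`, which has the form `OpenCL <major>.<minor> <vendor-specific information>`. The sketch below is an assumption about how the check could be written, not necessarily how this layer does it; `deviceSupportsCloneKernel` is a hypothetical helper, and the version string is passed in directly so the example does not need an OpenCL runtime.

```cpp
#include <cstdio>
#include <string>

// Returns true when a CL_DEVICE_VERSION-style string reports OpenCL 2.1
// or newer, which is the minimum version that provides clCloneKernel.
bool deviceSupportsCloneKernel(const std::string& deviceVersion)
{
    unsigned major = 0;
    unsigned minor = 0;
    // Expected prefix: "OpenCL <major>.<minor>"; anything else is rejected.
    if (std::sscanf(deviceVersion.c_str(), "OpenCL %u.%u", &major, &minor) != 2) {
        return false;
    }
    return major > 2 || (major == 2 && minor >= 1);
}
```

A layer could use a result of `false` to leave the extension names out of the extension string it reports for that device.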
The most important concept to understand from this sample is how to intercept clGetExtensionFunctionAddressForPlatform to return emulated functions for an extension. The key APIs are:

- clGetExtensionFunctionAddressForPlatform
- clInitLayer

The following environment variables can modify the behavior of the command buffer emulation layer:

| Environment Variable | Behavior | Example Format |
|---|---|---|
| CMDBUFEMU_EnhancedErrorChecking | Enables additional error checking when commands are added to a command buffer using a command buffer "test queue". By default, the additional error checking is disabled. | export CMDBUFEMU_EnhancedErrorChecking=1<br/>set CMDBUFEMU_EnhancedErrorChecking=1 |
| CMDBUFEMU_KernelForProfiling | Enables use of an empty kernel for event profiling instead of event profiling on a command-queue barrier. By default, to minimize overhead, the empty kernel is not used. | export CMDBUFEMU_KernelForProfiling=1<br/>set CMDBUFEMU_KernelForProfiling=1 |
| CMDBUFEMU_SuggestedLocalWorkSize | Enables use of the suggested local work-group size extension to eliminate NULL local work-group sizes. Only valid when an implementation supports the local work-group size extension and the command is not mutable. By default, use of the suggested local work-group size is enabled. | export CMDBUFEMU_SuggestedLocalWorkSize=0<br/>set CMDBUFEMU_SuggestedLocalWorkSize=0 |
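
Controls like these are typically read once with `std::getenv` and interpreted as a boolean, falling back to a default when the variable is unset. The following is a minimal sketch under that assumption; `parseControl` and `getControl` are hypothetical helper names, and the layer's actual parsing may differ.

```cpp
#include <cstdlib>

// Interprets an environment variable value as a boolean control.
// An unset or empty value keeps the default; otherwise any non-zero
// integer enables the control and zero disables it.
bool parseControl(const char* value, bool defaultValue)
{
    if (value == nullptr || *value == '\0') {
        return defaultValue;
    }
    return std::atoi(value) != 0;
}

// Looks up the named environment variable and parses it as a control.
bool getControl(const char* name, bool defaultValue)
{
    return parseControl(std::getenv(name), defaultValue);
}
```

For example, `getControl("CMDBUFEMU_EnhancedErrorChecking", false)` would model a control that is disabled by default, while `getControl("CMDBUFEMU_SuggestedLocalWorkSize", true)` would model one that is enabled by default.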
This section describes some of the limitations of the emulated cl_khr_command_buffer functionality:
- Some error conditions are not properly checked for and returned.
- Deferred kernel arguments are supported, but CL_COMMAND_BUFFER_STATE_FINALIZED_KHR is not properly handled.
- Many functions are not thread safe.