|
| 1 | +// Copyright 2021 The Khronos Group. This work is licensed under a |
| 2 | +// Creative Commons Attribution 4.0 International License; see |
| 3 | +// http://creativecommons.org/licenses/by/4.0/ |
| 4 | + |
| 5 | +[[cl_khr_subgroup_rotate]] |
| 6 | +== Subgroup Rotation |
| 7 | + |
| 8 | +This extension adds support for a new subgroup data exchange operation that |
| 9 | +makes it possible to rotate values through the work items in a subgroup. |
| 10 | + |
| 11 | +=== General Information |
| 12 | + |
| 13 | +==== Name Strings |
| 14 | + |
| 15 | +`cl_khr_subgroup_rotate` |
| 16 | + |
| 17 | +==== Version History |
| 18 | + |
| 19 | +[cols="1,1,3",options="header",] |
| 20 | +|==== |
| 21 | +| *Date* | *Version* | *Description* |
| 22 | +| 2022-04-22 | 1.0.0 | Initial version. |
| 23 | +|==== |
| 24 | + |
| 25 | +==== Dependencies |
| 26 | + |
| 27 | +This extension is written against the OpenCL Specification Version 3.0.10, |
| 28 | +the OpenCL C Specification Version 3.0.10, and the OpenCL Environment |
| 29 | +Specification Version 3.0.10. |
| 30 | + |
| 31 | +This extension requires OpenCL 2.0. |
| 32 | + |
| 33 | +==== Contributors |
| 34 | + |
| 35 | +Kévin Petit, Arm Ltd. + |
| 36 | +Ben Ashbaugh, Intel + |
| 37 | +Ruihao Zhang, Qualcomm + |
| 38 | +Sven van Haastregt, Arm Ltd. + |
| 39 | +Anastasia Stulova, Arm Ltd. + |
| 40 | +Stuart Brady, Arm Ltd. + |
| 41 | + |
| 42 | +=== New OpenCL C Functions |
| 43 | + |
| 44 | +This extension adds the following built-in functions: |
| 45 | + |
| 46 | +[source,c] |
| 47 | +---- |
| 48 | +gentype sub_group_rotate(gentype value, int delta) |
| 49 | +gentype sub_group_clustered_rotate(gentype value, int delta, uint clustersize) |
| 50 | +---- |
| 51 | + |
| 52 | +=== Modifications to the OpenCL C Specification |
| 53 | + |
| 54 | +(Add a new section 6.15.x, *Subgroup Rotation*) :: |
| 55 | ++ |
| 56 | +-- |
| 57 | + |
| 58 | +The following preprocessor definition is added: |
| 59 | + |
| 60 | +[source,c] |
| 61 | +---- |
| 62 | +#define cl_khr_subgroup_rotate 1 |
| 63 | +---- |
| 64 | + |
| 65 | +The table below describes specialized OpenCL C programming language built-in |
| 66 | +functions that allow work items in a subgroup to exchange data. These functions |
| 67 | +need not be encountered by all work items in a subgroup executing the kernel. |
| 68 | +For the functions below, the generic type name `gentype` may be one of the |
| 69 | +supported built-in scalar data types `char`, `uchar`, `short`, `ushort`, `int`, |
| 70 | +`uint`, `long`, `ulong`, `float`, `double` (if double precision is supported), |
| 71 | +or `half` (if half precision is supported). |
| 72 | + |
| 73 | +[cols="1a,1",options="header",] |
| 74 | +|======================================================================= |
| 75 | +|*Function* |
| 76 | +|*Description* |
| 77 | + |
| 78 | +|[source,c] |
| 79 | +---- |
| 80 | +gentype sub_group_rotate( |
| 81 | + gentype value, int delta) |
| 82 | +---- |
| 83 | +| Returns _value_ for the work item with subgroup local ID equal to the remainder |
| 84 | +of the division of the sum of this work item's subgroup local ID and _delta_ by |
| 85 | +the maximum subgroup size. + |
| 86 | +The value of _delta_ is required to be dynamically-uniform for all work items in |
| 87 | +the subgroup, otherwise the behavior is undefined. |
| 88 | + |
| 89 | +The return value is undefined if the work item with subgroup local ID equal to the |
| 90 | +calculated index is inactive. |
| 91 | + |
| 92 | +|[source,c] |
| 93 | +---- |
| 94 | +gentype sub_group_clustered_rotate( |
| 95 | + gentype value, int delta, uint clustersize) |
| 96 | +---- |
| 97 | +| Returns _value_ for the work item whose subgroup local ID is computed as follows: |
| 98 | +the sum of this work item's ID within its cluster and _delta_ is divided by |
| 99 | +_clustersize_, and the remainder is added to the subgroup local ID of the first |
| 100 | +work item of that cluster. + |
| 101 | +The value of _delta_ is required to be dynamically-uniform for all work items in |
| 102 | +the subgroup, otherwise the behavior is undefined. |
| 103 | + |
| 104 | +_clustersize_ must be an integer constant expression and a power of two, less |
| 105 | +than or equal to the maximum subgroup size, otherwise the behavior is undefined. |
| 106 | + |
| 107 | +The return value is undefined if the work item with subgroup local ID equal to the |
| 108 | +calculated index is inactive. |
| 109 | +|======================================================================= |
| 110 | +-- |
| 111 | + |
| 112 | +=== Modifications to the OpenCL SPIR-V Environment Specification |
| 113 | + |
| 114 | +See OpenCL SPIR-V Environment Specification. |
| 115 | + |
| 116 | +=== Interactions with Other Extensions |
| 117 | + |
| 118 | +If `cl_khr_il_program` is supported, then the SPIR-V environment specification |
| 119 | +modifications described above apply. |
| 120 | + |
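
The index arithmetic in the two function descriptions can be checked on the host with a minimal sketch in plain C. This is an illustration only, not part of the extension: `wrap_index`, `rotate_source`, and `clustered_rotate_source` are hypothetical helper names, and folding a negative sum into the range `[0, size)` is an assumption, since the text above only says "remainder of the division".

```c
#include <assert.h>

/* Hypothetical helper (not part of the extension): remainder of sum / size,
 * folded into [0, size). Wrapping negative sums this way is an assumption. */
static unsigned wrap_index(long long sum, unsigned size)
{
    assert(size > 0);
    long long m = (long long)size;
    return (unsigned)(((sum % m) + m) % m);
}

/* Subgroup local ID whose value sub_group_rotate would return:
 * (local_id + delta) modulo the maximum subgroup size. */
unsigned rotate_source(unsigned local_id, int delta, unsigned max_subgroup_size)
{
    return wrap_index((long long)local_id + delta, max_subgroup_size);
}

/* Subgroup local ID whose value sub_group_clustered_rotate would return:
 * first ID of the caller's cluster plus
 * (ID within the cluster + delta) modulo cluster_size. */
unsigned clustered_rotate_source(unsigned local_id, int delta, unsigned cluster_size)
{
    /* The description requires cluster_size to be a power of two. */
    assert(cluster_size > 0 && (cluster_size & (cluster_size - 1)) == 0);
    unsigned id_in_cluster = local_id % cluster_size;
    unsigned cluster_base = local_id - id_in_cluster;
    return cluster_base + wrap_index((long long)id_in_cluster + delta, cluster_size);
}
```

Under these assumptions, with a maximum subgroup size of 8 and a _delta_ of 1, the work item with subgroup local ID 7 reads from ID 0. With a _clustersize_ of 4 and the same _delta_, ID 7 reads from ID 4, the first work item of its own cluster, rather than crossing into the neighbouring cluster.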