Commit eeae87e

Add specification for cl_khr_subgroup_rotate (#781)
Signed-off-by: Kevin Petit <kevin.petit@arm.com>
Change-Id: I3acf34660b83c65533d244c7ceb17ec5219d1949
1 parent eaac933 commit eeae87e

4 files changed

Lines changed: 134 additions & 0 deletions

OpenCL_Ext.txt

Lines changed: 2 additions & 0 deletions
@@ -103,6 +103,8 @@ include::ext/cl_khr_external_memory.asciidoc[]
 include::ext/cl_khr_command_buffer.asciidoc[]
 include::ext/cl_khr_expect_assume.asciidoc[]
 
+include::ext/cl_khr_subgroup_rotate.asciidoc[]
+
 // NOTE: To keep meaningful section numbers, new
 // extension documents should be added above here!
 

env/extensions.asciidoc

Lines changed: 8 additions & 0 deletions
@@ -347,6 +347,14 @@ If the OpenCL environment supports the extension `cl_khr_expect_assume` and use
 
 * *ExpectAssumeKHR*
 
+==== `cl_khr_subgroup_rotate`
+
+If the OpenCL environment supports the extension `cl_khr_subgroup_rotate`,
+then the environment accepts modules that require `SPV_KHR_subgroup_rotate` and
+declare the following SPIR-V capabilities:
+
+* *GroupNonUniformRotateKHR*
+
 === Embedded Profile Extensions
 
 ==== `cles_khr_int64`

ext/cl_khr_subgroup_rotate.asciidoc

Lines changed: 120 additions & 0 deletions
@@ -0,0 +1,120 @@
+// Copyright 2021 The Khronos Group. This work is licensed under a
+// Creative Commons Attribution 4.0 International License; see
+// http://creativecommons.org/licenses/by/4.0/
+
+[[cl_khr_subgroup_rotate]]
+== Subgroup Rotation
+
+This extension adds support for a new subgroup data exchange operation that
+makes it possible to rotate values through the work items in a subgroup.
+
+=== General Information
+
+==== Name Strings
+
+`cl_khr_subgroup_rotate`
+
+==== Version History
+
+[cols="1,1,3",options="header",]
+|====
+| *Date* | *Version* | *Description*
+| 2022-04-22 | 1.0.0 | Initial version.
+|====
+
+==== Dependencies
+
+This extension is written against the OpenCL Specification Version 3.0.10,
+the OpenCL C Specification Version 3.0.10, and the OpenCL SPIR-V Environment
+Specification Version 3.0.10.
+
+This extension requires OpenCL 2.0.
+
+==== Contributors
+
+Kévin Petit, Arm Ltd. +
+Ben Ashbaugh, Intel +
+Ruihao Zhang, Qualcomm +
+Sven van Haastregt, Arm Ltd. +
+Anastasia Stulova, Arm Ltd. +
+Stuart Brady, Arm Ltd. +
+
+=== New OpenCL C Functions
+
+This extension adds the following built-in functions:
+
+[source,c]
+----
+gentype sub_group_rotate(gentype value, int delta)
+gentype sub_group_clustered_rotate(gentype value, int delta, uint clustersize)
+----
+
+=== Modifications to the OpenCL C Specification
+
+(Add a new section 6.15.x, *Subgroup Rotation*) ::
++
+--
+
+The following preprocessor definition is added:
+
+[source,c]
+----
+#define cl_khr_subgroup_rotate 1
+----
+
+The table below describes specialized OpenCL C programming language built-in
+functions that allow work items in a subgroup to exchange data. These functions
+need not be encountered by all work items in a subgroup executing the kernel.
+For the functions below, the generic type name `gentype` may be one of the
+supported built-in scalar data types `char`, `uchar`, `short`, `ushort`, `int`,
+`uint`, `long`, `ulong`, `float`, `double` (if double precision is supported),
+or `half` (if half precision is supported).
+
+[cols="1a,1",options="header",]
+|=======================================================================
+|*Function*
+|*Description*
+
+|[source,c]
+----
+gentype sub_group_rotate(
+    gentype value, int delta)
+----
+| Returns _value_ for the work item with subgroup local ID equal to the remainder
+of the division of the sum of this work item's subgroup local ID and _delta_ by
+the maximum subgroup size. +
+The value of _delta_ is required to be dynamically-uniform for all work items in
+the subgroup; otherwise the behavior is undefined.
+
+The return value is undefined if the work item with subgroup local ID equal to the
+calculated index is inactive.
+
+|[source,c]
+----
+gentype sub_group_clustered_rotate(
+    gentype value, int delta, uint clustersize)
+----
+| Returns _value_ for the work item whose subgroup local ID equals the subgroup
+local ID of the first work item of the cluster to which the calling work item
+belongs, plus the remainder of the division of the sum of the calling work item's
+ID within the cluster and _delta_ by _clustersize_. +
+The value of _delta_ is required to be dynamically-uniform for all work items in
+the subgroup; otherwise the behavior is undefined.
+
+_clustersize_ must be an integer constant expression whose value is a power of two
+no larger than the maximum subgroup size; otherwise the behavior is undefined.
+
+The return value is undefined if the work item with subgroup local ID equal to the
+calculated index is inactive.
+|=======================================================================
+--
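
Note on the `sub_group_rotate` row above: the index arithmetic can be checked with a small host-side model in plain C. This is only a sketch, not part of the extension. The names `rotate_source_id` and `model_sub_group_rotate` are hypothetical, the subgroup is assumed to be fully active, and wrapping negative sums back into range is an assumption, since the table does not spell out the remainder for a negative `delta`.

```c
#include <assert.h>

/* Hypothetical host-side model (not part of the extension): the subgroup
 * local ID whose value a work item receives from sub_group_rotate. */
static unsigned rotate_source_id(unsigned local_id, int delta,
                                 unsigned max_subgroup_size)
{
    assert(max_subgroup_size > 0 && local_id < max_subgroup_size);
    long long size = (long long)max_subgroup_size;
    /* Remainder of (local ID + delta) divided by the maximum subgroup size. */
    long long r = ((long long)local_id + delta) % size;
    if (r < 0)
        r += size; /* assumption: negative sums wrap around */
    return (unsigned)r;
}

/* Models one call made by every work item of a fully active subgroup:
 * out[id] is what the work item with subgroup local ID `id` gets back. */
static void model_sub_group_rotate(const int *value, int *out, int delta,
                                   unsigned max_subgroup_size)
{
    for (unsigned id = 0; id < max_subgroup_size; ++id)
        out[id] = value[rotate_source_id(id, delta, max_subgroup_size)];
}
```

With a maximum subgroup size of 8 and a `delta` of 3, work item 0 reads from work item 3, and work item 6 wraps around to read from work item 1.
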
+
+=== Modifications to the OpenCL SPIR-V Environment Specification
+
+See OpenCL SPIR-V Environment Specification.
+
+=== Interactions with Other Extensions
+
+If `cl_khr_il_program` is supported then the SPIR-V environment specification
+modifications described above apply.
+
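
Note on `sub_group_clustered_rotate` as described in the table in this file: the source index can be sketched in plain C as below. This is a host-side model, not part of the extension. The name `clustered_rotate_source_id` is hypothetical. It assumes that clusters are contiguous and aligned, so that a work item's ID within its cluster is its subgroup local ID modulo `clustersize`, and that negative sums wrap back into the cluster. Neither assumption is stated in the text.

```c
#include <assert.h>

/* Hypothetical host-side model (not part of the extension): the subgroup
 * local ID whose value a work item receives from
 * sub_group_clustered_rotate. */
static unsigned clustered_rotate_source_id(unsigned local_id, int delta,
                                           unsigned cluster_size)
{
    /* clustersize must be a power of two. */
    assert(cluster_size != 0 && (cluster_size & (cluster_size - 1)) == 0);

    /* Assumption: clusters are contiguous, aligned runs of work items. */
    unsigned id_in_cluster = local_id % cluster_size;
    unsigned cluster_base = local_id - id_in_cluster;

    long long size = (long long)cluster_size;
    /* Remainder of (ID within cluster + delta) divided by clustersize. */
    long long r = ((long long)id_in_cluster + delta) % size;
    if (r < 0)
        r += size; /* assumption: negative sums wrap within the cluster */

    /* Add the subgroup local ID of the first work item of the cluster. */
    return cluster_base + (unsigned)r;
}
```

With a `clustersize` of 4 and a `delta` of 1, work item 5 reads from work item 6, and work item 7 wraps to the start of its own cluster and reads from work item 4. Values never cross a cluster boundary.
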

ext/quick_reference.asciidoc

Lines changed: 4 additions & 0 deletions
@@ -257,6 +257,10 @@
 | Relative Shuffles Among Sub-Groupings of Work Items
 | Extension
 
+| <<cl_khr_subgroup_rotate,cl_khr_subgroup_rotate>>
+| Rotation Among Sub-Groupings of Work Items
+| Extension
+
 | <<cl_khr_suggested_local_work_size,cl_khr_suggested_local_work_size>>
 | Query a Suggested Local Work Size
 | Extension
