cxx/annotation.txt
Lines changed: 32 additions & 32 deletions
@@ -6,19 +6,19 @@

 '''''
 [[ftn2]]
-<<ftnref2,[2]>> The `double` data type is an optional type that is supported if `CL_DEVICE_DOUBLE_FP_CONFIG` in table 4.3 for a device is not zero.
+[2] The `double` data type is an optional type that is supported if `CL_DEVICE_DOUBLE_FP_CONFIG` in table 4.3 for a device is not zero.

 [[ftn3]]
-<<ftnref3,[3]>> The question mark ? in numerical selector refers to special undefined component of vector; reading from it results in undefined value, writing to it is discarded.
+[3] The question mark ? in numerical selector refers to special undefined component of vector; reading from it results in undefined value, writing to it is discarded.

 [[ftn4]]
-<<ftnref4,[4]>> Only if the *cl_khr_fp16* extension is enabled and has been supported
+[4] Only if the *cl_khr_fp16* extension is enabled and has been supported

 [[ftn5]]
-<<ftnref5,[5]>> For conversions to floating-point format, when a finite source value exceeds the maximum representable finite floating-point destination value, the rounding mode will affect whether the result is the maximum finite floating-point value or infinity of same sign as the source value, per IEEE-754 rules for rounding.
+[5] For conversions to floating-point format, when a finite source value exceeds the maximum representable finite floating-point destination value, the rounding mode will affect whether the result is the maximum finite floating-point value or infinity of same sign as the source value, per IEEE-754 rules for rounding.

 [[ftn6]]
-<<ftnref6,[6]>> The `as_type<T>` function is intended to reflect the organization of data in register.
+[6] The `as_type<T>` function is intended to reflect the organization of data in register.
 The `as_type<T>` construct is intended to compile to no instructions on devices that use a shared register file designed to operate on both the operand and result types.
 Note that while differences in memory organization are expected to largely be limited to those arising from endianness, the register based representation may also differ due to size of the element in register.
 (For example, an architecture may load a char into a 32-bit register, or a char vector into a SIMD vector register with fixed 32-bit element size.)
@@ -27,91 +27,91 @@ So, for example if an implementation stores all single precision data as double
 If data stored in different address spaces do not have the same endianness, then the "dominant endianness" of the device should prevail.

 [[ftn7]]
-<<ftnref7,[7]>> `memory_order_consume` is not supported in OpenCL {cpp}
+[7] `memory_order_consume` is not supported in OpenCL {cpp}

 [[ftn8]]
-<<ftnref8,[8]>> This value for `memory_scope` can only be used with `atomic_fence` with flags set to `mem_fence::image`.
+[8] This value for `memory_scope` can only be used with `atomic_fence` with flags set to `mem_fence::image`.

 [[ftn9]]
-<<ftnref9,[9]>> We can't require {cpp14} atomics since host programs can be implemented in other programming languages and versions of C or {cpp}, but we do require that the host programs use atomics and that those atomics be compatible with those in {cpp14}.
+[9] We can't require {cpp14} atomics since host programs can be implemented in other programming languages and versions of C or {cpp}, but we do require that the host programs use atomics and that those atomics be compatible with those in {cpp14}.

 [[ftn10]]
-<<ftnref10,[10]>> The `atomic_long` and `atomic_ulong` types are supported if the *cl_khr_int64_base_atomics* and *cl_khr_int64_extended_atomics* extensions are supported and have been enabled.
+[10] The `atomic_long` and `atomic_ulong` types are supported if the *cl_khr_int64_base_atomics* and *cl_khr_int64_extended_atomics* extensions are supported and have been enabled.

 [[ftn11]]
-<<ftnref11,[11]>> The `atomic_double` type is only supported if double precision is supported and the *cl_khr_int64_base_atomics* and *cl_khr_int64_extended_atomics* extensions are supported and have been enabled.
+[11] The `atomic_double` type is only supported if double precision is supported and the *cl_khr_int64_base_atomics* and *cl_khr_int64_extended_atomics* extensions are supported and have been enabled.

 [[ftn12]]
-<<ftnref12,[12]>> If the device address space is 64-bits, the data types `atomic_intptr_t`, `atomic_uintptr_t`, `atomic_size_t` and `atomic_ptrdiff_t` are supported only if the *cl_khr_int64_base_atomics* and *cl_khr_int64_extended_atomics* extensions are supported and have been enabled.
+[12] If the device address space is 64-bits, the data types `atomic_intptr_t`, `atomic_uintptr_t`, `atomic_size_t` and `atomic_ptrdiff_t` are supported only if the *cl_khr_int64_base_atomics* and *cl_khr_int64_extended_atomics* extensions are supported and have been enabled.

 [[ftn13]]
-<<ftnref13,[13]>> The `\*_ms` types are supported only if the *cl_khr_gl_msaa_sharing* and *cl_khr_gl_depth_images* extensions are supported and have been enabled.
+[13] The `\*_ms` types are supported only if the *cl_khr_gl_msaa_sharing* and *cl_khr_gl_depth_images* extensions are supported and have been enabled.

 [[ftn14]]
-<<ftnref14,[14]>> Immediate meaning not side effects resulting from child kernels. The side effects would include stores to global memory and pipe reads and writes.
+[14] Immediate meaning not side effects resulting from child kernels. The side effects would include stores to global memory and pipe reads and writes.

 [[ftn15]]
-<<ftnref15,[15]>> This acts as a memory synchronization point between work-items in a work-group and child kernels enqueued by work-items in the work-group.
+[15] This acts as a memory synchronization point between work-items in a work-group and child kernels enqueued by work-items in the work-group.

 // Footnote 16 removed - duplicated footnote 17

 [[ftn17]]
-<<ftnref17,[17]>> i.e. the `global_work_size` values specified to `clEnqueueNDRangeKernel` are not evenly divisible by the `local_work_size` values for each dimension.
+[17] i.e. the `global_work_size` values specified to `clEnqueueNDRangeKernel` are not evenly divisible by the `local_work_size` values for each dimension.

 [[ftn18]]
-<<ftnref18,[18]>> Only if double precision is supported and has been enabled.
+[18] Only if double precision is supported and has been enabled.

 [[ftn19]]
-<<ftnref19,[19]>> Refer to the <<order-and-scope, _Memory order and scope_>> section for description of `memory_scope`.
+[19] Refer to the <<order-and-scope, _Memory order and scope_>> section for description of `memory_scope`.

 [[ftn20]]
-<<ftnref20,[20]>> The `min()` operator is there to prevent `fract(-small)` from returning 1.0.
+[20] The `min()` operator is there to prevent `fract(-small)` from returning 1.0.
 It returns the largest positive floating-point number less than 1.0.

 [[ftn21]]
-<<ftnref21,[21]>> fmin and fmax behave as defined by {cpp14} and may not match the IEEE 754-2008 definition for minNum and maxNum with regard to signaling NaNs.
+[21] fmin and fmax behave as defined by {cpp14} and may not match the IEEE 754-2008 definition for minNum and maxNum with regard to signaling NaNs.
 Specifically, signaling NaNs may behave as quiet NaNs.

 [[ftn22]]
-<<ftnref22,[22]>> The user is cautioned that for some usages, e.g. `mad(a, b, -a*b)`, the definition of `mad()` in the embedded profile is loose enough that almost any result is allowed from `mad()` for some values of `a` and `b`.
+[22] The user is cautioned that for some usages, e.g. `mad(a, b, -a*b)`, the definition of `mad()` in the embedded profile is loose enough that almost any result is allowed from `mad()` for some values of `a` and `b`.

 [[ftn23]]
-<<ftnref23,[23]>> Frequently vector operations need n + 1 bits temporarily to calculate a result.
+[23] Frequently vector operations need n + 1 bits temporarily to calculate a result.
 The rhadd instruction gives you an extra bit without needing to upsample and downsample. This can be a profound performance win.

 [[ftn24]]
-<<ftnref24,[24]>> The primary purpose of the printf function is to help in debugging OpenCL kernels.
+[24] The primary purpose of the printf function is to help in debugging OpenCL kernels.

 [[ftn25]]
-<<ftnref25,[25]>> Note that _0_ is taken as a flag, not as the beginning of a field width.
+[25] Note that _0_ is taken as a flag, not as the beginning of a field width.

 [[ftn26]]
-<<ftnref26,[26]>> The results of all floating conversions of a negative zero, and of negative values that round to zero, include a minus sign.
+[26] The results of all floating conversions of a negative zero, and of negative values that round to zero, include a minus sign.

 [[ftn27]]
-<<ftnref27,[27]>> When applied to infinite and NaN values, the -, +, and space flag characters have their usual meaning; the # and _0_ flag characters have no effect.
+[27] When applied to infinite and NaN values, the -, +, and space flag characters have their usual meaning; the # and _0_ flag characters have no effect.

 [[ftn28]]
-<<ftnref28,[28]>> Binary implementations can choose the hexadecimal digit to the left of the decimal-point character so that subsequent digits align to nibble (4-bit) boundaries.
+[28] Binary implementations can choose the hexadecimal digit to the left of the decimal-point character so that subsequent digits align to nibble (4-bit) boundaries.

 [[ftn29]]
-<<ftnref29,[29]>> No special provisions are made for multibyte characters.
+[29] No special provisions are made for multibyte characters.
 The behavior of printf with the _s_ conversion specifier is undefined if the argument value is not a pointer to a literal string.

 [[ftn30]]
-<<ftnref30,[30]>> Except for the embedded profile whether either round to zero or round to nearest rounding mode may be supported for single precision floating-point.
+[30] Except for the embedded profile whether either round to zero or round to nearest rounding mode may be supported for single precision floating-point.

 [[ftn31]]
-<<ftnref31,[31]>> The ULP values for built-in math functions `lgamma` and `lgamma_r` is currently undefined.
+[31] The ULP values for built-in math functions `lgamma` and `lgamma_r` is currently undefined.

 [[ftn32]]
-<<ftnref32,[32]>> 0 ulp is used for math functions that do not require rounding.
+[32] 0 ulp is used for math functions that do not require rounding.

 [[ftn33]]
-<<ftnref33,[33]>> On some implementations, `powr()` or `pown()` may perform faster than `pow()`.
+[33] On some implementations, `powr()` or `pown()` may perform faster than `pow()`.
 If `x` is known to be `>= 0`, consider using `powr()` in place of `pow()`, or if `y` is known to be an integer, consider using `pown()` in place of `pow()`.

 // Footnote 34 removed - duplicated footnote 32

 [[ftn35]]
-<<ftnref35,[35]>> Here `TYPE_MIN` and `TYPE_MIN_EXP` should be substituted by constants appropriate to the floating-point type under consideration, such as `FLT_MIN` and `FLT_MIN_EXP` for float.
+[35] Here `TYPE_MIN` and `TYPE_MIN_EXP` should be substituted by constants appropriate to the floating-point type under consideration, such as `FLT_MIN` and `FLT_MIN_EXP` for float.
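Footnote 20 above explains why `fract` is clamped with `min()`. The definition it annotates is not part of this diff, so the following is only a host-side C++ sketch that assumes the usual form `min(x - floor(x), largest float below 1.0)`. The helper names `largest_below_one` and `fract_sketch` are hypothetical and are not OpenCL {cpp} API.

```cpp
#include <cassert>
#include <cmath>

// Largest float strictly less than 1.0 (hypothetical helper name).
inline float largest_below_one() {
    return std::nextafter(1.0f, 0.0f);
}

// Sketch of a clamped fract(): stores floor(x) in *whole and returns the
// fractional part. Assumed form, not taken from the diff itself.
inline float fract_sketch(float x, float* whole) {
    *whole = std::floor(x);
    // For a tiny negative x, x - floor(x) rounds up to exactly 1.0f in
    // single precision; the fmin clamp keeps the result below 1.0.
    return std::fmin(x - *whole, largest_below_one());
}
```

Without the clamp, a call such as `fract_sketch(-1e-10f, &w)` would return 1.0, which is outside the half-open range a fractional part is expected to occupy.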
cxx/numerical_compliance/edge_case_behavior.txt
Lines changed: 1 addition & 1 deletion
@@ -143,4 +143,4 @@ If subnormals are flushed to zero, a device may choose to conform to the followi

 For clarity, subnormals or denormals are defined to be the set of representable numbers in the range 0 < x < `TYPE_MIN` and `-TYPE_MIN` < x < -0.
 They do not include {plusmn}0.
-A non-zero number is said to be sub-normal before rounding if after normalization, its radix-2 exponent is less than `(TYPE_MIN_EXP - 1)`. [[ftnref35]] <<ftn35,[35]>>
+A non-zero number is said to be sub-normal before rounding if after normalization, its radix-2 exponent is less than `(TYPE_MIN_EXP - 1)`. <<ftn35,[35]>>
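The two subnormal definitions in the hunk above (range-based and exponent-based) can be compared with a small host-side C++ sketch. It substitutes `FLT_MIN` and `FLT_MIN_EXP` for `TYPE_MIN` and `TYPE_MIN_EXP`, as footnote 35 suggests for float. The function names are hypothetical, and the sketch only classifies already-representable values, so it does not model the "before rounding" part of the wording. It also assumes the host does not flush subnormals to zero.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>
#include <limits>

// Range-based definition: 0 < |x| < FLT_MIN, which excludes +0 and -0.
inline bool is_subnormal_by_range(float x) {
    return x != 0.0f && std::fabs(x) < FLT_MIN;
}

// Exponent-based definition: std::ilogb reports the radix-2 exponent of x
// as if it were normalized to 1.f * 2^e, so FLT_MIN itself has exponent
// FLT_MIN_EXP - 1 and anything smaller is subnormal.
inline bool is_subnormal_by_exponent(float x) {
    return x != 0.0f && std::isfinite(x) && std::ilogb(x) < (FLT_MIN_EXP - 1);
}
```

For finite float inputs the two predicates should agree, which is a quick sanity check on the `(TYPE_MIN_EXP - 1)` threshold.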
-<<ulp_values_for_single_precision_builtin_math_functions,ULP values for single precision built-in math functions>> [[ftnref31]] <<ftn31,[31]>> table describes the minimum accuracy of single precision floating-point arithmetic operations given as ULP values.
+<<ulp_values_for_single_precision_builtin_math_functions,ULP values for single precision built-in math functions>> <<ftn31,[31]>> table describes the minimum accuracy of single precision floating-point arithmetic operations given as ULP values.
 The reference value used to compute the ULP value of an arithmetic operation is the infinitely precise result.
cxx/numerical_compliance/rounding_modes.txt
Lines changed: 1 addition & 1 deletion
@@ -13,6 +13,6 @@ IEEE 754 defines four possible rounding modes:
 * Round toward -infinity.
 * Round toward zero.

-_Round to nearest even_ is currently the only rounding mode required [[ftnref30]] <<ftn30,[30]>> by the OpenCL specification for single precision and double precision operations and is therefore the default rounding mode.
+_Round to nearest even_ is currently the only rounding mode required <<ftn30,[30]>> by the OpenCL specification for single precision and double precision operations and is therefore the default rounding mode.
 In addition, only static selection of rounding mode is supported.
 Static and dynamic selection of rounding mode is not supported.
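To make the four IEEE 754 rounding modes named in this hunk concrete, here is a host-side C++ sketch using `<cfenv>`. It switches the mode at run time, which is a host facility and not something the OpenCL text above offers on a device. The helper name `round_with_mode` is hypothetical, and the sketch assumes the host compiler honors the floating-point environment for `std::nearbyint`.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Rounds x to an integral value under the given <cfenv> rounding mode,
// then restores whatever mode was active before the call.
inline double round_with_mode(double x, int mode) {
    const int saved = std::fegetround();
    std::fesetround(mode);
    volatile double input = x;                        // discourage constant folding
    volatile double result = std::nearbyint(input);   // uses the current mode
    std::fesetround(saved);
    return result;
}
```

With `FE_TONEAREST` a tie such as 2.5 goes to the even neighbor 2.0, while `FE_UPWARD`, `FE_DOWNWARD` and `FE_TOWARDZERO` correspond to the round toward +infinity, -infinity and zero modes.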
cxx/stdlib/atomic_operations.txt
Lines changed: 4 additions & 4 deletions
@@ -386,11 +386,11 @@ enum memory_scope

 }
 ----
-An enumeration `memory_order` is described in section [atomics.order] of {cpp14} specification. [[ftnref7]] <<ftn7,[7]>>
+An enumeration `memory_order` is described in section [atomics.order] of {cpp14} specification. <<ftn7,[7]>>

 The enumerated type `memory_scope` specifies whether the memory ordering constraints given by `memory_order` apply to work-items in a work-group or work-items of a kernel(s) executing on the device or across devices (in the case of shared virtual memory). Its enumeration constants are as follows:
@@ -402,7 +402,7 @@ Atomic operations to local memory only guarantee memory ordering in the work-gro
 NOTE: With fine-grained system SVM, sharing happens at the granularity of individual loads and stores anywhere in host memory.
 Memory consistency is always guaranteed at synchronization points, but to obtain finer control over consistency, the OpenCL atomics functions may be used to ensure that the updates to individual data values made by
 one unit of execution are visible to other execution units.
-In particular, when a host thread needs fine control over the consistency of memory that is shared with one or more OpenCL devices, it must use atomic and fence operations that are compatible with the {cpp14} atomic operations [[ftnref9]] <<ftn9,[9]>>.
+In particular, when a host thread needs fine control over the consistency of memory that is shared with one or more OpenCL devices, it must use atomic and fence operations that are compatible with the {cpp14} atomic operations <<ftn9,[9]>>.

 [[atomic-lock-free-property]]
 ==== Atomic lock-free property
@@ -755,7 +755,7 @@ An application that wants to use 64-bit atomic types will need to define `cl_khr
 [[restrictions-3]]
 ==== Restrictions

-* The generic `atomic<T>` class template is only available if `T` is `int`, `uint`, `long`, `ulong` [[ftnref10]] <<ftn10,[10]>>, `float`, `double` [[ftnref11]] <<ftn11,[11]>>, `intptr_t` [[ftnref12]] <<ftn12,[12]>>, `uintptr_t`, `size_t`, `ptrdiff_t`.
+* The generic `atomic<T>` class template is only available if `T` is `int`, `uint`, `long`, `ulong` <<ftn10,[10]>>, `float`, `double` <<ftn11,[11]>>, `intptr_t` <<ftn12,[12]>>, `uintptr_t`, `size_t`, `ptrdiff_t`.
 * The `atomic_bool`, `atomic_char`, `atomic_uchar`, `atomic_short`, `atomic_ushort`, `atomic_intmax_t` and `atomic_uintmax_t` types are not supported by OpenCL {cpp}.
 * OpenCL {cpp} requires that the built-in atomic functions on atomic types are lock-free.
 * The atomic data types cannot be declared inside a kernel or non-kernel function unless they are declared as `static` keyword or in `local<T>` and `global<T>` containers.
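The hunks above say that a host thread must use atomics compatible with the {cpp14} atomic operations when it needs fine control over shared memory. As a rough illustration of what that looks like on the host side only, here is a standard C++ release/acquire sketch. The `Mailbox` type and both function names are hypothetical, and the sketch does not show any OpenCL device code or SVM allocation. It deliberately avoids `memory_order_consume`, which footnote 7 lists as unsupported in OpenCL {cpp}.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical shared record: a plain payload guarded by an atomic flag.
struct Mailbox {
    int payload = 0;
    std::atomic<bool> ready{false};
};

// Writer side: fill the payload, then publish it with a release store so
// the payload write cannot be reordered after the flag becomes visible.
inline void publish(Mailbox& box, int value) {
    box.payload = value;
    box.ready.store(true, std::memory_order_release);
}

// Reader side: an acquire load of the flag makes the earlier payload
// write visible once the flag is observed as true.
inline bool try_consume(Mailbox& box, int& out) {
    if (!box.ready.load(std::memory_order_acquire)) {
        return false;
    }
    out = box.payload;
    return true;
}
```

In a real program `publish` and `try_consume` would run in different execution units; the pairing of release and acquire is what carries the ordering guarantee.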
cxx/stdlib/conversions.txt
Lines changed: 2 additions & 2 deletions
@@ -48,7 +48,7 @@ T convert_cast(T const& arg);
 [[data-types]]
 ==== Data Types

-Conversions are available for the following scalar types: `bool`, `char`, `uchar`, `short`, `ushort`, `int`, `uint`, `long`, `ulong`, `half` [[ftnref4]] <<ftn4,[4]>>, `float`, `double`, and built-in vector types derived therefrom.
+Conversions are available for the following scalar types: `bool`, `char`, `uchar`, `short`, `ushort`, `int`, `uint`, `long`, `ulong`, `half` <<ftn4,[4]>>, `float`, `double`, and built-in vector types derived therefrom.
 The operand and result type must have the same number of elements.
 The operand and result type may be the same type in which case the conversion has no effect on the type or value of an expression.

@@ -70,7 +70,7 @@ Conversions may have an optional rounding mode specified as described in the tab
 | `rtn` | Round toward negative infinity
 |====

-If a rounding mode is not specified, conversions to integer type use the `rtz` (round toward zero) rounding mode and conversions to floating-point type [[ftnref5]] <<ftn5,[5]>> uses the `rte` rounding mode.
+If a rounding mode is not specified, conversions to integer type use the `rtz` (round toward zero) rounding mode and conversions to floating-point type <<ftn5,[5]>> uses the `rte` rounding mode.
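The default rounding behavior described in the last hunk (`rtz` toward integers, `rte` toward floating-point) happens to match what plain casts do on a typical IEEE 754 host, which makes it easy to see in standard C++. This is an analogy rather than the `convert_cast` API itself: the function names below are hypothetical, and the int-to-float result is implementation-defined in C++, so round-to-nearest-even is an assumption about the host.

```cpp
#include <cassert>
#include <cstdint>

// float -> int: C++ truncates toward zero, matching the rtz default.
// (Out-of-range inputs are undefined behavior here and are not handled.)
inline std::int32_t to_int_default(float x) {
    return static_cast<std::int32_t>(x);
}

// int -> float: on IEEE 754 hosts in the default mode, values that are
// not exactly representable round to nearest, ties to even (rte).
inline float to_float_default(std::int32_t x) {
    return static_cast<float>(x);
}
```

For example, 16777217 (2^24 + 1) sits exactly between two representable floats and lands on 16777216, the neighbor with the even significand.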
cxx/stdlib/device_enqueue.txt
Lines changed: 2 additions & 2 deletions
@@ -416,8 +416,8 @@ It is defined as follows:
 | `no_wait`
 | Indicates that the enqueued kernels do not need to wait for the parent kernel to finish execution before they begin execution.
 | `wait_kernel`
-| Indicates that all work-items of the parent kernel must finish executing and all immediate [[ftnref14]] <<ftn14,[14]>> side effects committed before the enqueued child kernel may begin execution.
-| `wait_work_group` [[ftnref15]] <<ftn15,[15]>>
+| Indicates that all work-items of the parent kernel must finish executing and all immediate <<ftn14,[14]>> side effects committed before the enqueued child kernel may begin execution.
+| `wait_work_group` <<ftn15,[15]>>
 | Indicates that the enqueued kernels wait only for the work-group that enqueued the kernels to finish before they begin execution.