| 1 | +:data-uri: |
| 2 | +:sectanchors: |
| 3 | +:icons: font |
| 4 | +:source-highlighter: coderay |
| 5 | + |
| 6 | += cl_intel_bfloat16_conversions |
| 7 | + |
| 8 | +== Name Strings |
| 9 | + |
| 10 | +`cl_intel_bfloat16_conversions` |
| 11 | + |
| 12 | +== Contact |
| 13 | + |
| 14 | +Ben Ashbaugh, Intel (ben 'dot' ashbaugh 'at' intel 'dot' com) |
| 15 | + |
| 16 | +== Contributors |
| 17 | + |
| 18 | +// spell-checker: disable |
| 19 | +Ben Ashbaugh, Intel + |
| 20 | +Alexey Sotkin, Intel + |
| 21 | +Lukasz Towarek, Intel |
| 22 | +// spell-checker: enable |
| 23 | + |
| 24 | +== Notice |
| 25 | + |
| 26 | +Copyright (c) 2022 Intel Corporation. All rights reserved. |
| 27 | + |
| 28 | +== Status |
| 29 | + |
| 30 | +Shipping |
| 31 | + |
| 32 | +== Version |
| 33 | + |
| 34 | +Built On: {docdate} + |
| 35 | +Revision: 1.0.0 |
| 36 | + |
| 37 | +== Dependencies |
| 38 | + |
| 39 | +This extension is written against the OpenCL 3.0 C Language specification and the OpenCL SPIR-V Environment specification, V3.0.8. |
| 40 | + |
| 41 | +This extension requires OpenCL 1.0. |
| 42 | + |
| 43 | +== Overview |
| 44 | + |
| 45 | +This extension adds built-in functions to convert between single-precision 32-bit floating-point values and 16-bit `bfloat16` values. |
| 46 | +The 16-bit `bfloat16` format has a dynamic range similar to that of the 32-bit `float` format, but lower precision than the 16-bit `half` format. |
| 47 | + |
| 48 | +Note that this extension does not currently introduce a `bfloat16` type to OpenCL C. Instead, the built-in functions convert to or from a `ushort` 16-bit unsigned integer whose bit pattern represents a `bfloat16` value. |
| 49 | + |
| 50 | +== New API Functions |
| 51 | + |
| 52 | +None. |
| 53 | + |
| 54 | +== New API Enums |
| 55 | + |
| 56 | +None. |
| 57 | + |
| 58 | +== New API Types |
| 59 | + |
| 60 | +None. |
| 61 | + |
| 62 | +== New OpenCL C Functions |
| 63 | + |
| 64 | +[source] |
| 65 | +---- |
| 66 | +ushort intel_convert_bfloat16_as_ushort(float source); |
| 67 | +ushort2 intel_convert_bfloat162_as_ushort2(float2 source); |
| 68 | +ushort3 intel_convert_bfloat163_as_ushort3(float3 source); |
| 69 | +ushort4 intel_convert_bfloat164_as_ushort4(float4 source); |
| 70 | +ushort8 intel_convert_bfloat168_as_ushort8(float8 source); |
| 71 | +ushort16 intel_convert_bfloat1616_as_ushort16(float16 source); |
| 72 | + |
| 73 | +float intel_convert_as_bfloat16_float(ushort source); |
| 74 | +float2 intel_convert_as_bfloat162_float2(ushort2 source); |
| 75 | +float3 intel_convert_as_bfloat163_float3(ushort3 source); |
| 76 | +float4 intel_convert_as_bfloat164_float4(ushort4 source); |
| 77 | +float8 intel_convert_as_bfloat168_float8(ushort8 source); |
| 78 | +float16 intel_convert_as_bfloat1616_float16(ushort16 source); |
| 79 | +---- |
| 80 | + |
| 81 | +== Modifications to the OpenCL C Specification |
| 82 | + |
| 83 | +=== Add a new Section 6.3.1.X - The `bfloat16` Format |
| 84 | + |
| 85 | +The `bfloat16` format is a floating-point format occupying 16 bits. |
| 86 | +It is a truncated version of the 32-bit IEEE 754 single-precision floating-point format. |
| 87 | +The `bfloat16` format includes one sign bit, eight exponent bits (the same as the 32-bit single-precision floating-point format), and seven mantissa bits (fewer than the ten mantissa bits of the 16-bit IEEE 754-2008 half-precision floating-point format). |
| 88 | +This means that a `bfloat16` number can represent numeric values with a dynamic range similar to that of a 32-bit `float` number, but with lower precision than a 16-bit `half` number. |
| 89 | + |
| 90 | +The `cl_intel_bfloat16_conversions` extension does not add `bfloat16` as a supported data type for OpenCL kernels; however, the built-in functions added by the extension are able to use and return `bfloat16` data. |
| 91 | +For these built-in functions, the `bfloat16` data is passed to the function or returned from the function by encoding it into a `ushort` 16-bit unsigned integer data type. |
| 92 | +If a future extension adds `bfloat16` as a supported data type for OpenCL kernels, the `bfloat16` data may be reinterpreted and passed to the built-in functions added by `cl_intel_bfloat16_conversions` using the *as_type()* operator. |
| 93 | + |
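The bit layout described above can be illustrated with a minimal host-side C sketch. The type `bf16_fields` and the function `bf16_unpack` are hypothetical names chosen for illustration; they are not part of the extension. The sketch assumes only the field widths stated in this section (1 sign bit, 8 exponent bits, 7 mantissa bits) and that the `ushort` encoding is held in a `uint16_t`.

```c
#include <stdint.h>

/* Hypothetical helper type: the three bit fields of a bfloat16 value
   stored in a 16-bit unsigned integer. */
typedef struct {
    unsigned sign;      /* 1 bit */
    unsigned exponent;  /* 8 bits, same width as the float exponent */
    unsigned mantissa;  /* 7 bits */
} bf16_fields;

/* Split a bfloat16 bit pattern into its sign, exponent, and mantissa. */
static bf16_fields bf16_unpack(uint16_t bits)
{
    bf16_fields f;
    f.sign     = (bits >> 15) & 0x1u;
    f.exponent = (bits >> 7) & 0xFFu;
    f.mantissa = bits & 0x7Fu;
    return f;
}
```

For example, the pattern `0x3FC0` (the upper 16 bits of the `float` value 1.5, whose full encoding is `0x3FC00000`) unpacks to sign 0, exponent 127, and mantissa `0x40`.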
| 94 | +=== Add a new Section 6.4.X - `bfloat16` Conversions |
| 95 | + |
| 96 | +The `bfloat16` format can be used in explicit conversions using the following suite of functions: |
| 97 | + |
| 98 | +[source] |
| 99 | +---- |
| 100 | +// conversions to bfloat16: |
| 101 | +destType intel_convert_bfloat16_as_destType(sourceType) |
| 102 | +destTypen intel_convert_bfloat16n_as_destTypen(sourceTypen) |
| 103 | + |
| 104 | +// conversions from bfloat16: |
| 105 | +destType intel_convert_as_bfloat16_destType(sourceType) |
| 106 | +destTypen intel_convert_as_bfloat16n_destTypen(sourceTypen) |
| 107 | +---- |
| 108 | + |
| 109 | +The number of elements in the source and destination vectors must match. |
| 110 | + |
| 111 | +Conversions from `float` to `bfloat16` implicitly use the round-to-nearest-even rounding mode, which is the only supported rounding mode. |
| 112 | +No explicit rounding modes are supported. |
| 113 | + |
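As a rough illustration of round-to-nearest-even when narrowing from `float` to `bfloat16`, the following host-side C sketch emulates the conversion on the raw bit pattern. The function name `ref_float_to_bf16` is hypothetical and is not the implementation behind `intel_convert_bfloat16_as_ushort`. The NaN handling shown is an assumption, since this extension text does not specify the resulting NaN bit pattern.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical host-side reference; not part of the extension. */
static uint16_t ref_float_to_bf16(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);  /* reinterpret the float's bits */

    /* Assumption: a NaN input yields a quiet NaN that keeps the sign bit.
       The extension does not specify the NaN payload. */
    if ((bits & 0x7FFFFFFFu) > 0x7F800000u) {
        return (uint16_t)((bits >> 16) | 0x0040u);
    }

    /* Round to nearest, ties to even: add 0x7FFF plus the lowest bit that
       will be kept, then discard the low 16 bits. */
    uint32_t lsb = (bits >> 16) & 1u;
    bits += 0x7FFFu + lsb;
    return (uint16_t)(bits >> 16);
}
```

Under this sketch, a value exactly halfway between two `bfloat16` values rounds to the one whose lowest mantissa bit is zero: 1.00390625 (`0x3F808000`) becomes `0x3F80`, while 1.01171875 (`0x3F818000`) becomes `0x3F82`.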
| 114 | +Supported scalar and vector data types: |
| 115 | + |
| 116 | +[width="100%",options="header"] |
| 117 | +|==== |
| 118 | +| destType | sourceType |
| 119 | + |
| 120 | +| `bfloat16` (as `ushort`) |
| 121 | + | `float` |
| 122 | + |
| 123 | +| `bfloat162`, `bfloat163`, `bfloat164`, `bfloat168`, `bfloat1616` + |
| 124 | + (as `ushort2`, `ushort3`, `ushort4`, `ushort8`, `ushort16`) |
| 125 | + | `float2`, `float3`, `float4`, `float8`, `float16` |
| 126 | + |
| 127 | +| `float` |
| 128 | + | `bfloat16` (as `ushort`) |
| 129 | + |
| 130 | +| `float2`, `float3`, `float4`, `float8`, `float16` |
| 131 | + | `bfloat162`, `bfloat163`, `bfloat164`, `bfloat168`, `bfloat1616` + |
| 132 | + (as `ushort2`, `ushort3`, `ushort4`, `ushort8`, `ushort16`) |
| 133 | + |
| 134 | +|==== |
| 135 | + |
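Because `bfloat16` keeps the sign and exponent layout of `float`, widening a `bfloat16` value to `float` only needs to place the 16 bits in the upper half of a 32-bit word. The following host-side C sketch shows this; the function name `ref_bf16_to_float` is hypothetical and is not the implementation behind `intel_convert_as_bfloat16_float`.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical host-side reference; not part of the extension. */
static float ref_bf16_to_float(uint16_t bf16_bits)
{
    /* The bfloat16 bits become the upper 16 bits of the float;
       the lower 16 mantissa bits are zero, so no rounding occurs. */
    uint32_t bits = (uint32_t)bf16_bits << 16;
    float value;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

A vector conversion would apply the same operation to each element independently, which is why the source and destination element counts must match.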
| 136 | +== Modifications to the OpenCL SPIR-V Environment Specification |
| 137 | + |
| 138 | +=== Add a new section 5.2.X - `cl_intel_bfloat16_conversions` |
| 139 | + |
| 140 | +If the OpenCL environment supports the extension `cl_intel_bfloat16_conversions` then the environment must accept modules that declare use of the extension `SPV_INTEL_bfloat16_conversion` and that declare the SPIR-V capability *Bfloat16ConversionINTEL*. |
| 141 | + |
| 142 | +For the instructions *OpConvertFToBF16INTEL* and *OpConvertBF16ToFINTEL* added by the extension: |
| 143 | + |
| 144 | + * Valid types for _Result Type_, _Float Value_, and _Bfloat16 Value_ are scalars and *OpTypeVector* types with 2, 3, 4, 8, or 16 _Component Count_ components. |
| 145 | + |
| 146 | +== Issues |
| 147 | + |
| 148 | +. Should these functions have a special prefix (such as `+__+`) or suffix (such as `+_as_ushort+`) since they do not truly operate on a `bfloat16` type? |
| 149 | ++ |
| 150 | +-- |
| 151 | +*RESOLVED*: Yes, we will use the `+_as_ushort+` nomenclature. |
| 152 | + |
| 153 | +The function name to convert to a `ushort` representing a `bfloat16` value is `intel_convert_bfloat16_as_ushort`. |
| 154 | + |
| 155 | +The function name to convert from a `ushort` representing a `bfloat16` value is `intel_convert_as_bfloat16_float`. |
| 156 | +-- |
| 157 | + |
| 158 | +. Should we define a type alias for our `bfloat16` type or use `ushort` (or `short`) directly? |
| 159 | ++ |
| 160 | +-- |
| 161 | +*RESOLVED*: No, we will not define a type alias. |
| 162 | +-- |
| 163 | + |
| 164 | +. Should the integer `bfloat16` representation be signed or unsigned? |
| 165 | ++ |
| 166 | +-- |
| 167 | +*RESOLVED*: We will use an unsigned type. |
| 168 | +-- |
| 169 | + |
| 170 | +. Should we support vector conversion built-in functions? |
| 171 | ++ |
| 172 | +-- |
| 173 | +*RESOLVED*: Yes, we will support the vector conversion built-in functions for consistency. |
| 174 | +-- |
| 175 | + |
| 176 | +. Should we support built-in functions with explicit rounding modes? |
| 177 | ++ |
| 178 | +-- |
| 179 | +*RESOLVED*: No, we will not support the built-in functions with explicit rounding modes for the initial version of this extension. |
| 180 | + |
| 181 | +The only supported rounding mode for the conversion from `float` to `bfloat16` will be the implicit round-to-nearest-even rounding mode. |
| 182 | + |
| 183 | +The conversions from `bfloat16` to `float` are lossless. |
| 184 | +-- |
| 185 | + |
| 186 | +. Do we need to support packed conversions? |
| 187 | ++ |
| 188 | +-- |
| 189 | +*RESOLVED*: No, we will not support packed conversions for the initial version of this extension. |
| 190 | +If we decide to add packed conversions we will also need to add them to the SPIR-V extension. |
| 191 | +-- |
| 192 | + |
| 193 | +. Do we need to say anything about out-of-range conversions? |
| 194 | ++ |
| 195 | +-- |
| 196 | +*RESOLVED*: No, out-of-range behavior is covered by existing rounding rules. |
| 197 | +-- |
| 198 | + |
| 199 | +. How should we name the vector conversion functions? |
| 200 | ++ |
| 201 | +-- |
| 202 | +*RESOLVED*: The name of the vector conversion functions will be `intel_convert_bfloat16__n___as_ushort__n__` and `intel_convert_as_bfloat16__n___float__n__`. |
| 203 | +This is consistent with the naming of the existing conversion functions. |
| 204 | + |
| 205 | +Because `bfloat16` ends with a number, this does lead to awkward function names like `intel_convert_bfloat1616_as_ushort16`, but the awkwardness is preferable to the ambiguity that would arise without the vector size suffix. |
| 206 | + |
| 207 | +If we decide to add a true `bfloat16` type we should consider other names that do not end in a number (`bfloat16_t`?). |
| 208 | +-- |
| 209 | + |
| 210 | +== Revision History |
| 211 | + |
| 212 | +[cols="5,15,15,70"] |
| 213 | +[grid="rows"] |
| 214 | +[options="header"] |
| 215 | +|======================================== |
| 216 | +|Version|Date|Author|Changes |
| 217 | +|0.9.0|2021-09-03|Ben Ashbaugh|*Initial revision* |
| 218 | +|0.9.0|2021-10-01|Ben Ashbaugh|Reduced scope, resolved all open issues. |
| 219 | +|0.9.0|2021-10-19|Ben Ashbaugh|Fixed the names of the vector conversion functions. |
| 220 | +|1.0.0|2022-08-26|Ben Ashbaugh|Updated version. |
| 221 | +|======================================== |
| 222 | + |
| 223 | + |
| 224 | +//************************************************************************ |
| 225 | +//Other formatting suggestions: |
| 226 | +// |
| 227 | +//* Use *bold* text for host APIs, or [source] syntax highlighting. |
| 228 | +//* Use `mono` text for device APIs, or [source] syntax highlighting. |
| 229 | +//* Use `mono` text for extension names, types, or enum values. |
| 230 | +//* Use _italics_ for parameters. |
| 231 | +//************************************************************************ |