The nvptx intrinsics f16x2_min/f16x2_max/f16x2_min_nan/f16x2_max_nan are currently mapped to the LLVM intrinsics minnum/maxnum/minimum/maximum, respectively (in some cases indirectly via simd_fmin/simd_fmax, which are documented to correspond to minnum nsz/maxnum nsz, although we currently don't actually emit the nsz attribute). See here for an overview of the LLVM float min/max operations.
This is incorrect:
- According to the docs, the behavior for signed zeros is defined by (a < b) ? a : b, i.e., when both operands compare equal, the 2nd operand is returned. That's not what any of the LLVM intrinsics does: they either treat -0.0 as smaller than +0.0 (that's the default), or return either value non-deterministically (when the nsz attribute is present). [This means it is actually a bug that LLVM uses the min.f16x2 nvptx operation for lowering minnum...]
- According to the docs, assuming that isNaN checks for both QNaN and SNaN, if exactly one input is any NaN, the other input is returned for f16x2_min/f16x2_max. In contrast, minnum/maxnum say that when an input is SNaN, the return value is a NaN or the other input. The LLVM variant with the correct NaN semantics is minimumnum/maximumnum.
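
To make the signed-zero mismatch concrete, here is a minimal scalar sketch of the two selection rules described above. It is only a model, not the actual lowering: f32 stands in for a single f16 lane, and the helper names ptx_style_min and signed_zero_ordered_min are hypothetical. The non-deterministic nsz behavior is not modeled.

```rust
/// Model of the documented rule `(a < b) ? a : b`, combined with the
/// documented "exactly one NaN input returns the other input" rule.
/// Hypothetical helper; f32 stands in for one f16 lane.
/// If both inputs are NaN this returns `b`, which is still a NaN.
fn ptx_style_min(a: f32, b: f32) -> f32 {
    if a.is_nan() {
        return b;
    }
    if b.is_nan() {
        return a;
    }
    // When the operands compare equal (including -0.0 vs +0.0),
    // `a < b` is false and the 2nd operand is returned.
    if a < b { a } else { b }
}

/// Model of a min that treats -0.0 as smaller than +0.0, which is the
/// default signed-zero behavior described for the LLVM intrinsics.
/// Hypothetical helper; NaN handling is simplified to match the one above.
fn signed_zero_ordered_min(a: f32, b: f32) -> f32 {
    if a.is_nan() {
        return b;
    }
    if b.is_nan() {
        return a;
    }
    if a == b {
        // Equal operands can only differ in the sign of zero:
        // prefer the negative one.
        if a.is_sign_negative() { a } else { b }
    } else if a < b {
        a
    } else {
        b
    }
}

fn main() {
    let ptx = ptx_style_min(-0.0, 0.0);
    let ordered = signed_zero_ordered_min(-0.0, 0.0);
    // The two models disagree on the sign of the result for (-0.0, +0.0).
    println!("ptx-style result is negative zero: {}", ptx.is_sign_negative());
    println!("ordered result is negative zero:   {}", ordered.is_sign_negative());
}
```

With the arguments swapped to (+0.0, -0.0), both models return -0.0, so the disagreement only shows up when the negative zero is the first operand.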
Cc @kjetilkjeka @folkertdev