[AArch64] Add 9.7 CVT data processing intrinsics#186807

Open
MartinWehking wants to merge 6 commits into llvm:main from MartinWehking:data-intrinsics
Conversation

@MartinWehking
Contributor

@MartinWehking MartinWehking commented Mar 16, 2026

Add Clang intrinsics
svcvtt_f16_s8, _f32_s16, _f64_s32, _f16_u8, _f32_u16, _f64_u32
svcvtb_f16_s8, _f32_s16, _f64_s32, _f16_u8, _f32_u16, _f64_u32

  • lowering to the AArch64 instructions SCVTF, SCVTFLT, UCVTF, UCVTFLT

and Clang intrinsics:
svcvtzn_s8[_f16_x2], _s32[_f64_x2], _u8[_f16_x2], _u16[_f32_x2], _u32[_f64_x2]

  • lowering to the AArch64 instructions FCVTZSN, FCVTZUN

The Clang intrinsics are guarded by the sve2p3 and sme2p3 feature flags.

ACLE Patch:
ARM-software/acle#428

Add Clang/LLVM intrinsics for svcvt, scvtflt, ucvtf, ucvtflt and fcvtzsn,
fcvtzun.
The Clang intrinsics are guarded by the sve2p3 and sme2p3 feature flags.

ACLE Patch:
ARM-software/acle#428
@llvmbot llvmbot added backend:AArch64 clang:frontend Language frontend issues, e.g. anything involving "Sema" llvm:ir labels Mar 16, 2026
@llvmbot
Member

llvmbot commented Mar 16, 2026

@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-backend-aarch64

Author: Martin Wehking (MartinWehking)

Changes

Add Clang/LLVM intrinsics for svcvt, scvtflt, ucvtf, ucvtflt and fcvtzsn, fcvtzun.
The Clang intrinsics are guarded by the sve2.3 and sme2.3 feature flags.

ACLE Patch:
ARM-software/acle#428


Patch is 39.30 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/186807.diff

8 Files Affected:

  • (modified) clang/include/clang/Basic/arm_sve.td (+27)
  • (added) clang/test/CodeGen/AArch64/sve2p3-intrinsics/acle_sve2_fp_int_cvtn_x2.c (+105)
  • (added) clang/test/CodeGen/AArch64/sve2p3-intrinsics/acle_sve2_int_fp_cvt.c (+189)
  • (modified) llvm/include/llvm/IR/IntrinsicsAArch64.td (+33)
  • (modified) llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td (+6-6)
  • (modified) llvm/lib/Target/AArch64/SVEInstrFormats.td (+13-2)
  • (added) llvm/test/CodeGen/AArch64/sve2p3-intrinsics-fp-converts.ll (+255)
  • (added) llvm/test/CodeGen/AArch64/sve2p3-intrinsics-fp-converts_x2.ll (+157)
diff --git a/clang/include/clang/Basic/arm_sve.td b/clang/include/clang/Basic/arm_sve.td
index be3cd8a76503b..852cc60c6e0b3 100644
--- a/clang/include/clang/Basic/arm_sve.td
+++ b/clang/include/clang/Basic/arm_sve.td
@@ -997,6 +997,33 @@ def SVCVTLT_Z_F32_F16  : SInst<"svcvtlt_f32[_f16]", "dPh", "f", MergeZeroExp, "a
 def SVCVTLT_Z_F64_F32  : SInst<"svcvtlt_f64[_f32]", "dPh", "d", MergeZeroExp, "aarch64_sve_fcvtlt_f64f32",  [IsOverloadNone, VerifyRuntimeMode]>;
 
 }
+
+let SVETargetGuard = "sve2p3|sme2p3", SMETargetGuard = "sve2p3|sme2p3" in {
+def SVCVT_S8_F16  : SInst<"svcvt_s8[_f16_x2]",  "d2.O", "c", MergeNone, "aarch64_sve_fcvtzsn", [IsOverloadWhileOrMultiVecCvt, VerifyRuntimeMode]>;
+def SVCVT_S16_F32  : SInst<"svcvt_s16[_f32_x2]",  "d2.M", "s", MergeNone, "aarch64_sve_fcvtzsn", [IsOverloadWhileOrMultiVecCvt, VerifyRuntimeMode]>;
+def SVCVT_S32_F64  : SInst<"svcvt_s32[_f64_x2]",  "d2.N", "i", MergeNone, "aarch64_sve_fcvtzsn", [IsOverloadWhileOrMultiVecCvt, VerifyRuntimeMode]>;
+
+def SVCVT_U8_F16  : SInst<"svcvt_u8[_f16_x2]",  "d2.O", "Uc", MergeNone, "aarch64_sve_fcvtzun", [IsOverloadWhileOrMultiVecCvt, VerifyRuntimeMode]>;
+def SVCVT_U16_F32  : SInst<"svcvt_u16[_f32_x2]",  "d2.M", "Us", MergeNone, "aarch64_sve_fcvtzun", [IsOverloadWhileOrMultiVecCvt, VerifyRuntimeMode]>;
+def SVCVT_U32_F64  : SInst<"svcvt_u32[_f64_x2]",  "d2.N", "Ui", MergeNone, "aarch64_sve_fcvtzun", [IsOverloadWhileOrMultiVecCvt, VerifyRuntimeMode]>;
+
+def SVCVTT_F16_S8  : SInst<"svcvtt_f16[_s8]",  "Od", "c", MergeNone, "aarch64_sve_scvtflt_f16i8", [IsOverloadNone, VerifyRuntimeMode]>;
+def SVCVTT_F32_S16  : SInst<"svcvtt_f32[_s16]",  "Md", "s", MergeNone, "aarch64_sve_scvtflt_f32i16", [IsOverloadNone, VerifyRuntimeMode]>;
+def SVCVTT_F64_S32  : SInst<"svcvtt_f64[_s32]",  "Nd", "i", MergeNone, "aarch64_sve_scvtflt_f64i32", [IsOverloadNone, VerifyRuntimeMode]>;
+
+def SVCVTT_F16_U8  : SInst<"svcvtt_f16[_u8]",  "Od", "Uc", MergeNone, "aarch64_sve_ucvtflt_f16i8", [IsOverloadNone, VerifyRuntimeMode]>;
+def SVCVTT_F32_U16  : SInst<"svcvtt_f32[_u16]",  "Md", "Us", MergeNone, "aarch64_sve_ucvtflt_f32i16", [IsOverloadNone, VerifyRuntimeMode]>;
+def SVCVTT_F64_U32  : SInst<"svcvtt_f64[_u32]",  "Nd", "Ui", MergeNone, "aarch64_sve_ucvtflt_f64i32", [IsOverloadNone, VerifyRuntimeMode]>;
+
+def SVCVTB_F16_S8  : SInst<"svcvtb_f16[_s8]",  "Od", "c", MergeNone, "aarch64_sve_scvtfb_f16i8", [IsOverloadNone, VerifyRuntimeMode]>;
+def SVCVTB_F32_S16  : SInst<"svcvtb_f32[_s16]",  "Md", "s", MergeNone, "aarch64_sve_scvtfb_f32i16", [IsOverloadNone, VerifyRuntimeMode]>;
+def SVCVTB_F64_S32  : SInst<"svcvtb_f64[_s32]",  "Nd", "i", MergeNone, "aarch64_sve_scvtfb_f64i32", [IsOverloadNone, VerifyRuntimeMode]>;
+
+def SVCVTB_F16_U8  : SInst<"svcvtb_f16[_u8]",  "Od", "Uc", MergeNone, "aarch64_sve_ucvtfb_f16i8", [IsOverloadNone, VerifyRuntimeMode]>;
+def SVCVTB_F32_U16  : SInst<"svcvtb_f32[_u16]",  "Md", "Us", MergeNone, "aarch64_sve_ucvtfb_f32i16", [IsOverloadNone, VerifyRuntimeMode]>;
+def SVCVTB_F64_U32  : SInst<"svcvtb_f64[_u32]",  "Nd", "Ui", MergeNone, "aarch64_sve_ucvtfb_f64i32", [IsOverloadNone, VerifyRuntimeMode]>;
+}
+
 ////////////////////////////////////////////////////////////////////////////////
 // Permutations and selection
 
diff --git a/clang/test/CodeGen/AArch64/sve2p3-intrinsics/acle_sve2_fp_int_cvtn_x2.c b/clang/test/CodeGen/AArch64/sve2p3-intrinsics/acle_sve2_fp_int_cvtn_x2.c
new file mode 100644
index 0000000000000..a4a7c58e1ced9
--- /dev/null
+++ b/clang/test/CodeGen/AArch64/sve2p3-intrinsics/acle_sve2_fp_int_cvtn_x2.c
@@ -0,0 +1,105 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// RUN: %clang_cc1 -triple aarch64 -target-feature +sve -target-feature +sve2p3 -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64 -target-feature +sve -target-feature +sve2p3 -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+
+// RUN: %clang_cc1 -triple aarch64 -target-feature +sme -target-feature +sme2p3 -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64 -target-feature +sme -target-feature +sme2p3 -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +sve2p3\
+// RUN:   -S -disable-O0-optnone -Werror -Wall -o /dev/null %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sme -target-feature +sme2p3\
+// RUN:   -S -disable-O0-optnone -Werror -Wall -o /dev/null %s
+//
+// REQUIRES: aarch64-registered-target
+
+#include <arm_sve.h>
+
+#if defined __ARM_FEATURE_SME
+#define MODE_ATTR __arm_streaming
+#else
+#define MODE_ATTR
+#endif
+
+// CHECK-LABEL: @test_svcvt_s8_f16_x2(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 16 x i8> @llvm.aarch64.sve.fcvtzsn.nxv16i8.nxv8f16(<vscale x 8 x half> [[ZN_COERCE0:%.*]], <vscale x 8 x half> [[ZN_COERCE1:%.*]])
+// CHECK-NEXT:    ret <vscale x 16 x i8> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z20test_svcvt_s8_f16_x213svfloat16x2_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 16 x i8> @llvm.aarch64.sve.fcvtzsn.nxv16i8.nxv8f16(<vscale x 8 x half> [[ZN_COERCE0:%.*]], <vscale x 8 x half> [[ZN_COERCE1:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 16 x i8> [[TMP0]]
+//
+svint8_t test_svcvt_s8_f16_x2(svfloat16x2_t zn) MODE_ATTR {
+  return svcvt_s8_f16_x2(zn);
+}
+
+// CHECK-LABEL: @test_svcvt_s16_f32_x2(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x i16> @llvm.aarch64.sve.fcvtzsn.nxv8i16.nxv4f32(<vscale x 4 x float> [[ZN_COERCE0:%.*]], <vscale x 4 x float> [[ZN_COERCE1:%.*]])
+// CHECK-NEXT:    ret <vscale x 8 x i16> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z21test_svcvt_s16_f32_x213svfloat32x2_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x i16> @llvm.aarch64.sve.fcvtzsn.nxv8i16.nxv4f32(<vscale x 4 x float> [[ZN_COERCE0:%.*]], <vscale x 4 x float> [[ZN_COERCE1:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 8 x i16> [[TMP0]]
+//
+svint16_t test_svcvt_s16_f32_x2(svfloat32x2_t zn) MODE_ATTR {
+  return svcvt_s16_f32_x2(zn);
+}
+
+// CHECK-LABEL: @test_svcvt_s32_f64_x2(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x i32> @llvm.aarch64.sve.fcvtzsn.nxv4i32.nxv2f64(<vscale x 2 x double> [[ZN_COERCE0:%.*]], <vscale x 2 x double> [[ZN_COERCE1:%.*]])
+// CHECK-NEXT:    ret <vscale x 4 x i32> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z21test_svcvt_s32_f64_x213svfloat64x2_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x i32> @llvm.aarch64.sve.fcvtzsn.nxv4i32.nxv2f64(<vscale x 2 x double> [[ZN_COERCE0:%.*]], <vscale x 2 x double> [[ZN_COERCE1:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 4 x i32> [[TMP0]]
+//
+svint32_t test_svcvt_s32_f64_x2(svfloat64x2_t zn) MODE_ATTR {
+  return svcvt_s32_f64_x2(zn);
+}
+
+// CHECK-LABEL: @test_svcvt_u8_f16_x2(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 16 x i8> @llvm.aarch64.sve.fcvtzun.nxv16i8.nxv8f16(<vscale x 8 x half> [[ZN_COERCE0:%.*]], <vscale x 8 x half> [[ZN_COERCE1:%.*]])
+// CHECK-NEXT:    ret <vscale x 16 x i8> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z20test_svcvt_u8_f16_x213svfloat16x2_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 16 x i8> @llvm.aarch64.sve.fcvtzun.nxv16i8.nxv8f16(<vscale x 8 x half> [[ZN_COERCE0:%.*]], <vscale x 8 x half> [[ZN_COERCE1:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 16 x i8> [[TMP0]]
+//
+svuint8_t test_svcvt_u8_f16_x2(svfloat16x2_t zn) MODE_ATTR {
+  return svcvt_u8_f16_x2(zn);
+}
+
+// CHECK-LABEL: @test_svcvt_u16_f32_x2(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x i16> @llvm.aarch64.sve.fcvtzun.nxv8i16.nxv4f32(<vscale x 4 x float> [[ZN_COERCE0:%.*]], <vscale x 4 x float> [[ZN_COERCE1:%.*]])
+// CHECK-NEXT:    ret <vscale x 8 x i16> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z21test_svcvt_u16_f32_x213svfloat32x2_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x i16> @llvm.aarch64.sve.fcvtzun.nxv8i16.nxv4f32(<vscale x 4 x float> [[ZN_COERCE0:%.*]], <vscale x 4 x float> [[ZN_COERCE1:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 8 x i16> [[TMP0]]
+//
+svuint16_t test_svcvt_u16_f32_x2(svfloat32x2_t zn) MODE_ATTR {
+  return svcvt_u16_f32_x2(zn);
+}
+
+// CHECK-LABEL: @test_svcvt_u32_f64_x2(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x i32> @llvm.aarch64.sve.fcvtzun.nxv4i32.nxv2f64(<vscale x 2 x double> [[ZN_COERCE0:%.*]], <vscale x 2 x double> [[ZN_COERCE1:%.*]])
+// CHECK-NEXT:    ret <vscale x 4 x i32> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z21test_svcvt_u32_f64_x213svfloat64x2_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x i32> @llvm.aarch64.sve.fcvtzun.nxv4i32.nxv2f64(<vscale x 2 x double> [[ZN_COERCE0:%.*]], <vscale x 2 x double> [[ZN_COERCE1:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 4 x i32> [[TMP0]]
+//
+svuint32_t test_svcvt_u32_f64_x2(svfloat64x2_t zn) MODE_ATTR {
+  return svcvt_u32_f64_x2(zn);
+}
diff --git a/clang/test/CodeGen/AArch64/sve2p3-intrinsics/acle_sve2_int_fp_cvt.c b/clang/test/CodeGen/AArch64/sve2p3-intrinsics/acle_sve2_int_fp_cvt.c
new file mode 100644
index 0000000000000..6b7252e045e33
--- /dev/null
+++ b/clang/test/CodeGen/AArch64/sve2p3-intrinsics/acle_sve2_int_fp_cvt.c
@@ -0,0 +1,189 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// RUN: %clang_cc1 -triple aarch64 -target-feature +sve -target-feature +sve2p3 -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64 -target-feature +sve -target-feature +sve2p3 -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+
+// RUN: %clang_cc1 -triple aarch64 -target-feature +sme -target-feature +sme2p3 -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64 -target-feature +sme -target-feature +sme2p3 -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +sve2p3\
+// RUN:   -S -disable-O0-optnone -Werror -Wall -o /dev/null %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sme -target-feature +sme2p3\
+// RUN:   -S -disable-O0-optnone -Werror -Wall -o /dev/null %s
+//
+// REQUIRES: aarch64-registered-target
+
+#include <arm_sve.h>
+
+#if defined __ARM_FEATURE_SME
+#define MODE_ATTR __arm_streaming
+#else
+#define MODE_ATTR
+#endif
+
+// CHECK-LABEL: @test_svcvtb_f16_s8(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.aarch64.sve.scvtfb.f16i8(<vscale x 16 x i8> [[ZN:%.*]])
+// CHECK-NEXT:    ret <vscale x 8 x half> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z18test_svcvtb_f16_s8u10__SVInt8_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.aarch64.sve.scvtfb.f16i8(<vscale x 16 x i8> [[ZN:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 8 x half> [[TMP0]]
+//
+svfloat16_t test_svcvtb_f16_s8(svint8_t zn) MODE_ATTR {
+  return svcvtb_f16_s8(zn);
+}
+
+// CHECK-LABEL: @test_svcvtb_f32_s16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.aarch64.sve.scvtfb.f32i16(<vscale x 8 x i16> [[ZN:%.*]])
+// CHECK-NEXT:    ret <vscale x 4 x float> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z19test_svcvtb_f32_s16u11__SVInt16_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.aarch64.sve.scvtfb.f32i16(<vscale x 8 x i16> [[ZN:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 4 x float> [[TMP0]]
+//
+svfloat32_t test_svcvtb_f32_s16(svint16_t zn) MODE_ATTR {
+  return svcvtb_f32_s16(zn);
+}
+
+// CHECK-LABEL: @test_svcvtb_f64_s32(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.aarch64.sve.scvtfb.f64i32(<vscale x 4 x i32> [[ZN:%.*]])
+// CHECK-NEXT:    ret <vscale x 2 x double> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z19test_svcvtb_f64_s32u11__SVInt32_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.aarch64.sve.scvtfb.f64i32(<vscale x 4 x i32> [[ZN:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 2 x double> [[TMP0]]
+//
+svfloat64_t test_svcvtb_f64_s32(svint32_t zn) MODE_ATTR {
+  return svcvtb_f64_s32(zn);
+}
+
+// CHECK-LABEL: @test_svcvtb_f16_u8(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.aarch64.sve.ucvtfb.f16i8(<vscale x 16 x i8> [[ZN:%.*]])
+// CHECK-NEXT:    ret <vscale x 8 x half> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z18test_svcvtb_f16_u8u11__SVUint8_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.aarch64.sve.ucvtfb.f16i8(<vscale x 16 x i8> [[ZN:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 8 x half> [[TMP0]]
+//
+svfloat16_t test_svcvtb_f16_u8(svuint8_t zn) MODE_ATTR {
+  return svcvtb_f16_u8(zn);
+}
+
+// CHECK-LABEL: @test_svcvtb_f32_u16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.aarch64.sve.ucvtfb.f32i16(<vscale x 8 x i16> [[ZN:%.*]])
+// CHECK-NEXT:    ret <vscale x 4 x float> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z19test_svcvtb_f32_u16u12__SVUint16_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.aarch64.sve.ucvtfb.f32i16(<vscale x 8 x i16> [[ZN:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 4 x float> [[TMP0]]
+//
+svfloat32_t test_svcvtb_f32_u16(svuint16_t zn) MODE_ATTR {
+  return svcvtb_f32_u16(zn);
+}
+
+// CHECK-LABEL: @test_svcvtb_f64_u32(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.aarch64.sve.ucvtfb.f64i32(<vscale x 4 x i32> [[ZN:%.*]])
+// CHECK-NEXT:    ret <vscale x 2 x double> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z19test_svcvtb_f64_u32u12__SVUint32_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.aarch64.sve.ucvtfb.f64i32(<vscale x 4 x i32> [[ZN:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 2 x double> [[TMP0]]
+//
+svfloat64_t test_svcvtb_f64_u32(svuint32_t zn) MODE_ATTR {
+  return svcvtb_f64_u32(zn);
+}
+
+// CHECK-LABEL: @test_svcvt_f16_s8(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.aarch64.sve.scvtflt.f16i8(<vscale x 16 x i8> [[ZN:%.*]])
+// CHECK-NEXT:    ret <vscale x 8 x half> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z17test_svcvt_f16_s8u10__SVInt8_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.aarch64.sve.scvtflt.f16i8(<vscale x 16 x i8> [[ZN:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 8 x half> [[TMP0]]
+//
+svfloat16_t test_svcvt_f16_s8(svint8_t zn) MODE_ATTR {
+  return svcvtt_f16_s8(zn);
+}
+
+// CHECK-LABEL: @test_svcvt_f32_s16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.aarch64.sve.scvtflt.f32i16(<vscale x 8 x i16> [[ZN:%.*]])
+// CHECK-NEXT:    ret <vscale x 4 x float> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z18test_svcvt_f32_s16u11__SVInt16_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.aarch64.sve.scvtflt.f32i16(<vscale x 8 x i16> [[ZN:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 4 x float> [[TMP0]]
+//
+svfloat32_t test_svcvt_f32_s16(svint16_t zn) MODE_ATTR {
+  return svcvtt_f32_s16(zn);
+}
+
+// CHECK-LABEL: @test_svcvt_f64_s32(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.aarch64.sve.scvtflt.f64i32(<vscale x 4 x i32> [[ZN:%.*]])
+// CHECK-NEXT:    ret <vscale x 2 x double> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z18test_svcvt_f64_s32u11__SVInt32_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.aarch64.sve.scvtflt.f64i32(<vscale x 4 x i32> [[ZN:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 2 x double> [[TMP0]]
+//
+svfloat64_t test_svcvt_f64_s32(svint32_t zn) MODE_ATTR {
+  return svcvtt_f64_s32(zn);
+}
+
+// CHECK-LABEL: @test_svcvt_f16_u8(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.aarch64.sve.ucvtflt.f16i8(<vscale x 16 x i8> [[ZN:%.*]])
+// CHECK-NEXT:    ret <vscale x 8 x half> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z17test_svcvt_f16_u8u11__SVUint8_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.aarch64.sve.ucvtflt.f16i8(<vscale x 16 x i8> [[ZN:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 8 x half> [[TMP0]]
+//
+svfloat16_t test_svcvt_f16_u8(svuint8_t zn) MODE_ATTR {
+  return svcvtt_f16_u8(zn);
+}
+
+// CHECK-LABEL: @test_svcvt_f32_u16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.aarch64.sve.ucvtflt.f32i16(<vscale x 8 x i16> [[ZN:%.*]])
+// CHECK-NEXT:    ret <vscale x 4 x float> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z18test_svcvt_f32_u16u12__SVUint16_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.aarch64.sve.ucvtflt.f32i16(<vscale x 8 x i16> [[ZN:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 4 x float> [[TMP0]]
+//
+svfloat32_t test_svcvt_f32_u16(svuint16_t zn) MODE_ATTR {
+  return svcvtt_f32_u16(zn);
+}
+
+// CHECK-LABEL: @test_svcvt_f64_u32(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.aarch64.sve.ucvtflt.f64i32(<vscale x 4 x i32> [[ZN:%.*]])
+// CHECK-NEXT:    ret <vscale x 2 x double> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z18test_svcvt_f64_u32u12__SVUint32_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.aarch64.sve.ucvtflt.f64i32(<vscale x 4 x i32> [[ZN:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 2 x double> [[TMP0]]
+//
+svfloat64_t test_svcvt_f64_u32(svuint32_t zn) MODE_ATTR {
+  return svcvtt_f64_u32(zn);
+}
diff --git a/llvm/include/llvm/IR/IntrinsicsAArch64.td b/llvm/include/llvm/IR/IntrinsicsAArch64.td
index 75929cbc222ad..d9f7314740953 100644
--- a/llvm/include/llvm/IR/IntrinsicsAArch64.td
+++ b/llvm/include/llvm/IR/IntrinsicsAArch64.td
@@ -1051,6 +1051,7 @@ def llvm_nxv4i1_ty  : LLVMType<nxv4i1>;
 def llvm_nxv8i1_ty  : LLVMType<nxv8i1>;
 def llvm_nxv16i1_ty : LLVMType<nxv16i1>;
 def llvm_nxv16i8_ty : LLVMType<nxv16i8>;
+def llvm_nxv8i16_ty : LLVMType<nxv8i16>;
 def llvm_nxv4i32_ty : LLVMType<nxv4i32>;
 def llvm_nxv2i64_ty : LLVMType<nxv2i64>;
 def llvm_nxv8f16_ty : LLVMType<nxv8f16>;
@@ -2610,6 +2611,29 @@ def int_aarch64_sve_fmlslb_lane   : SVE2_3VectorArgIndexed_Long_Intrinsic;
 def int_aarch64_sve_fmlslt        : SVE2_3VectorArg_Long_Intrinsic;
 def int_aarch64_sve_fmlslt_lane   : SVE2_3VectorArgIndexed_Long_Intrinsic;
 
+//
+// SVE2 - Multi-vector narrowing convert to floating point
+//
+
+class Builtin_SVCVT_UNPRED<LLVMType OUT, LLVMType IN>
+    : DefaultAttrsIntrinsic<[OUT], [IN], [IntrNoMem]>;
+
+def int_aarch64_sve_scvtfb_f16i8: Builtin_SVCVT_UNPRED<llvm_nxv8f16_ty, llvm_nxv16i8_ty>;
+def int_aarch64_sve_scvtfb_f32i16: Builtin_SVCVT_UNPRED<llvm_nxv4f32_ty, llvm_nxv8i16_ty>;
+def int_aarch64_sve_scvtfb_f64i32: Builtin_SVCVT_UNPRED<llvm_nxv2f64_ty, llvm_nxv4i32_ty>;
+
+def int_aarch64_sve_scvtflt_f16i8: Builtin_SVCVT_UNPRED<llvm_nxv8f16_ty, llvm_nxv16i8_ty>;
+def int_aarch64_sve_scvtflt_f32i16: Builtin_SVCVT_UNPRED<llvm_nxv4f32_ty, llvm_nxv8i16_ty>;
+def int_aarch64_sve_scvtflt_f64i32: Builtin_SVCVT_UNPRED<llvm_nxv2f64_ty, llvm_nxv4i32_ty>;
+
+def int_aarch64_sve_ucvtfb_f16i8: Builtin_SVCVT_UNPRED<llvm_nxv8f16_ty, llvm_nxv16i8_ty>;
+def int_aarch64_sve_ucvtfb_f32i16: Builtin_SVCVT_UNPRED<llvm_nxv4f32_ty, llvm_nxv8i16_ty>;
+def int_aarch64_sve_ucvtfb_f64i32: Builtin_SVCVT_UNPRED<llvm_nxv2f64_ty, llvm_nxv4i32_ty>;
+
+def int_aarch64_sve_ucvtflt_f16i8: Builtin_SVCVT_UNPRED<llvm_nxv8f16_ty, llvm_nxv16i8_ty>;
+def int_aarch64_sve_ucvtflt_f32i16: Builtin_SVCVT_UNPRED<llvm_nxv4f32_ty, llvm_nxv8i16_ty>;
+def int_aarch64_sve_ucvtflt_f64i32: Builtin_SVCVT_UNPRED<llvm_nxv2f64_ty, llvm_nxv4i32_ty>;
+
 //
 // SVE2 - Floating-point integer binary logarithm
 //
@@ -3526,6 +3550,10 @@ let TargetPrefix = "aarch64" in {
                 [LLVMSubdivide2VectorType<0>, LLVMSubdivi...
[truncated]

[LLVMSubdivide2VectorType<0>, LLVMSubdivide2VectorType<0>],
[IntrNoMem]>;

class SVE2_CVT_VG2_Single_Intrinsic
Contributor

What's the difference between this and SVE2_CVT_VG2_SINGLE_Intrinsic on line 3418? Could that be re-used?

Contributor Author

Oh, I already thought that I saw an intrinsic with the same name somewhere else.

Unfortunately not. I was trying to use the subdivide-by-2 vector type, but the problem is that it throws compilation errors when the element type changes (fp -> int)

Contributor Author

I did rename the intrinsic to "SVE2_CVT_VG2_Narrowing_Intrinsic". Please let me know if that naming is okay and if the typing makes sense

Contributor

@jthackray jthackray left a comment

Are some negative tests required in clang/test/Sema/? e.g. passing svint16_t to svcvtb_f16_s8(). Also, some tests just with -target-feature +sve without sve2p3 or sme2p3 to check they're not allowed to be called?

@amilendra
Contributor

You should re-generate the Sema tests and commit them when adding new SVE/SME clang builtins. Command to generate the tests:

python3 llvm-project/clang/utils/aarch64_builtins_test_generator.py --gen-streaming-guard-tests build/tools/clang/include/clang/Basic/arm_sve_builtins.json --out-dir llvm-project/clang/test/Sema/AArch64/

Comment thread llvm/include/llvm/IR/IntrinsicsAArch64.td Outdated
#else
#define MODE_ATTR
#endif

Contributor

You probably should add some overloading tests here as well. See 542d2a5 for something similar.

#ifdef SVE_OVERLOADED_FORMS
// A simple used,unused... macro, long enough to represent any SVE builtin.
#define SVE_ACLE_FUNC(A1,A2_UNUSED,A3,A4_UNUSED) A1##A3
#else
#define SVE_ACLE_FUNC(A1,A2,A3,A4) A1##A2##A3##A4
#endif

Contributor Author

@MartinWehking MartinWehking Mar 18, 2026

Thanks, I noticed that I should have been more careful with how I set up the overloading.
This looks wrong, for example:

svcvtt_f16[_s8]

and

svcvtt_f16[_u8]

Contributor Author

@MartinWehking MartinWehking Mar 18, 2026

Please let me know if this looks okay after the update

Contributor Author

@MartinWehking MartinWehking Mar 18, 2026

Actually, I think it is fine to add an overload like that, because we could call svcvtt_f16 with a signed and unsigned argument in that case. I think I would just have to drop the "isOverloadNone" flag

Edit:
I have changed it back in the latest commit. As far as I understand, the "isOverloadNone" flag does not matter in this case. The overload introduced by the square brackets should be resolved as a valid shortened form.
It is fine to have this implicitly introduced overload, as in this example:
svcvtb_f64(svint32_t_val);
svcvtb_f64(svuint32_t_val);

@amilendra
Contributor

Could you change the subject of the commit message to "[AArch64] Add 9.7 CVT data processing intrinsics"? There are other types of data processing intrinsics (arithmetic data processing and absolute data processing) to be added by separate patches.

Comment thread llvm/test/CodeGen/AArch64/sve2p3-intrinsics-fp-converts.ll Outdated
Comment thread llvm/test/CodeGen/AArch64/sve2p3-intrinsics-fp-converts.ll Outdated
Contributor

@CarolineConcatto CarolineConcatto left a comment

Can you update the commit message and add which prototypes you are implementing?

@MartinWehking MartinWehking changed the title [AArch64] Add 9.7 data processing intrinsics [AArch64] Add 9.7 CVT data processing intrinsics Mar 18, 2026
@github-actions

github-actions bot commented Mar 18, 2026

🐧 Linux x64 Test Results

  • 200150 tests passed
  • 6258 tests skipped

✅ The build succeeded and all tests passed.

@github-actions

github-actions bot commented Mar 18, 2026

🪟 Windows x64 Test Results

  • 134643 tests passed
  • 4384 tests skipped

✅ The build succeeded and all tests passed.

@MartinWehking
Contributor Author

Are some negative tests required in clang/test/Sema/? e.g. passing svint16_t to svcvtb_f16_s8(). Also, some tests just with -target-feature +sve without sve2p3 or sme2p3 to check they're not allowed to be called?

I autogenerated the Sema test with the command that @amilendra suggested. Do you think there should be more testing added beyond the autogenerated lines?

@MartinWehking
Contributor Author

Can you update the commit message and add which prototypes are you implementing.

I've extended my comment. I hope that's ok

Adapt the test cases accordingly.
A clang intrinsic was renamed in the ACLE patch.
Change the name accordingly.

// CHECK-LABEL: @test_svcvtb_f64_s32(
// CHECK-NEXT: entry:
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.aarch64.sve.scvtfb.f64i32(<vscale x 4 x i32> [[ZN:%.*]])
Contributor

I am puzzled as to why this one has f64i32, as if these were scalar inputs, and not nxv8i16.nxv4f32 as expected for scalable vectors.

Contributor

I see now: this is because the intrinsics are defined like this in Intrinsics.td.

Contributor Author

Do you think we can leave the naming like it is?

Comment thread llvm/lib/Target/AArch64/SVEInstrFormats.td Outdated
Comment thread llvm/include/llvm/IR/IntrinsicsAArch64.td
Contributor

@CarolineConcatto CarolineConcatto left a comment

LGTM
