[SYCL] Add test to cover std::complex<float/double> mul/div#21622
[SYCL] Add test to cover std::complex<float/double> mul/div#21622jinge90 wants to merge 30 commits intointel:syclfrom
Conversation
Signed-off-by: jinge90 <ge.jin@intel.com>
Signed-off-by: jinge90 <ge.jin@intel.com>
Signed-off-by: jinge90 <ge.jin@intel.com>
|
Hi, @intel/llvm-reviewers-runtime and @sergey-semenov |
There was a problem hiding this comment.
Pull request overview
Adds new SYCL end-to-end coverage for device-side std::complex<float/double> multiplication and division, intended to exercise the libdevice/compiler-emitted __mul{sd}c3 / __div{sd}c3 builtins under “no-fast-math” behavior.
Changes:
- Introduce shared complex mul/div test utilities and result validation helpers.
- Add a comprehensive complex input matrix (incl. NaN/Inf/±0) and run mul/div tests for
floatanddouble. - Wire the new tests into the existing
std_complex_math_*e2e test executables (non-Windows).
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| sycl/test-e2e/DeviceLib/std_complex_math_test.cpp | Adds std::complex<float> mul/div e2e coverage using shared helpers and a large edge-case input set. |
| sycl/test-e2e/DeviceLib/std_complex_math_fp64_test.cpp | Adds std::complex<double> mul/div e2e coverage using the shared helpers and input set. |
| sycl/test-e2e/DeviceLib/complex_utils.hpp | New helper header providing classification/compare logic and SYCL kernels for mul/div matrix evaluation. |
|
|
||
| #include "complex_utils.hpp" | ||
| #include "math_utils.hpp" |
There was a problem hiding this comment.
This test is intended to exercise the complex mul/div builtins emitted in "no-fast-math" mode, but the RUN line here doesn’t pass -fno-fast-math (unlike std_complex_math_test.cpp and other DeviceLib math tests). Please define/use %{mathflags} and compile this test with -fno-fast-math to ensure the intended codegen is covered.
| #pragma once | ||
| #include <complex> | ||
| #include <sycl/detail/core.hpp> | ||
| enum { zero, non_zero, inf, NaN, non_zero_nan }; | ||
| template <typename T> int complex_classify(std::complex<T> x) { | ||
| if (x == std::complex<T>(0, 0)) | ||
| return zero; | ||
| if (std::isinf(x.real()) || std::isinf(x.imag())) | ||
| return inf; |
There was a problem hiding this comment.
complex_utils.hpp uses std::array, std::isnan, and std::isinf but only includes <complex>. Please add the missing standard headers (e.g., <array> and <cmath>) so this helper is self-contained and doesn’t rely on include order in the including test files.
| // DEFINE: %{mathflags} = %if cl_options %{/clang:-fno-fast-math%} %else %{-fno-fast-math%} | ||
| // RUN: %{build} %{mathflags} -o %t1.out | ||
| // RUN: %{run} %t1.out |
There was a problem hiding this comment.
The PR description says the mul/div builtins being exercised are Linux-only, but this test only excludes Windows (#ifndef _WIN32). That means the new mul/div coverage will still run on other non-Windows platforms (e.g. Darwin) where those builtins may not be available. Consider adding an appropriate LIT guard (e.g. // REQUIRES: linux or // UNSUPPORTED: system-darwin) so the test only runs where the targeted device-lib support exists.
| // REQUIRES: aspect-fp64 | ||
| // RUN: %{build} -o %t1.out | ||
| // DEFINE: %{mathflags} = %if cl_options %{/clang:-fno-fast-math%} %else %{-fno-fast-math%} | ||
| // RUN: %{build} %{mathflags} -o %t1.out | ||
| // RUN: %{run} %t1.out |
There was a problem hiding this comment.
Same issue here: the new mul/div builtins coverage is described as Linux-only, but the test is only gated with #ifndef _WIN32. Please add a LIT platform guard (e.g. // REQUIRES: linux or // UNSUPPORTED: system-darwin) so it doesn’t run on non-Linux non-Windows hosts by accident.
| complex<double> complex_input[] = {{1.e-6, 1.e-6}, | ||
| {-1.e-6, 1.e-6}, | ||
| {-1.e-6, -1.e-6}, | ||
| {1.e-6, -1.e-6}, |
There was a problem hiding this comment.
Same as in the float test: complex_input doesn’t need external linkage. Consider making it static (or move to an anonymous namespace) since it’s only consumed within this test TU.
| std::complex<T> *complex_input) { | ||
| constexpr size_t OutputSize = InputSize * InputSize; | ||
| sycl::range<1> numOfMulInput{InputSize}; | ||
| sycl::range<1> numOfMulOutput{OutputSize}; | ||
| std::array<std::complex<T>, OutputSize> complex_mul_result; | ||
| { | ||
| sycl::buffer<std::complex<T>, 1> buffer_complex_mul(complex_input, | ||
| numOfMulInput); |
There was a problem hiding this comment.
device_complex_test_mul only reads from complex_input. Consider taking const std::complex<T>* (and creating the input buffer from a const pointer where possible) to make the API intent clear and prevent accidental mutation.
| std::complex<T> *complex_input) { | |
| constexpr size_t OutputSize = InputSize * InputSize; | |
| sycl::range<1> numOfMulInput{InputSize}; | |
| sycl::range<1> numOfMulOutput{OutputSize}; | |
| std::array<std::complex<T>, OutputSize> complex_mul_result; | |
| { | |
| sycl::buffer<std::complex<T>, 1> buffer_complex_mul(complex_input, | |
| numOfMulInput); | |
| const std::complex<T> *complex_input) { | |
| constexpr size_t OutputSize = InputSize * InputSize; | |
| sycl::range<1> numOfMulInput{InputSize}; | |
| sycl::range<1> numOfMulOutput{OutputSize}; | |
| std::array<std::complex<T>, OutputSize> complex_mul_result; | |
| { | |
| sycl::buffer<const std::complex<T>, 1> buffer_complex_mul(complex_input, | |
| numOfMulInput); |
| std::complex<T> *complex_input) { | ||
| constexpr size_t OutputSize = InputSize * InputSize; | ||
| sycl::range<1> numOfDivInput{InputSize}; | ||
| sycl::range<1> numOfDivOutput{OutputSize}; | ||
| std::array<std::complex<T>, OutputSize> complex_div_result; | ||
| { | ||
| sycl::buffer<std::complex<T>, 1> buffer_complex_div(complex_input, | ||
| numOfDivInput); |
There was a problem hiding this comment.
device_complex_test_div only reads from complex_input. Consider taking const std::complex<T>* (and using a read-only buffer/accessor) to better express const-correctness.
| std::complex<T> *complex_input) { | |
| constexpr size_t OutputSize = InputSize * InputSize; | |
| sycl::range<1> numOfDivInput{InputSize}; | |
| sycl::range<1> numOfDivOutput{OutputSize}; | |
| std::array<std::complex<T>, OutputSize> complex_div_result; | |
| { | |
| sycl::buffer<std::complex<T>, 1> buffer_complex_div(complex_input, | |
| numOfDivInput); | |
| const std::complex<T> *complex_input) { | |
| constexpr size_t OutputSize = InputSize * InputSize; | |
| sycl::range<1> numOfDivInput{InputSize}; | |
| sycl::range<1> numOfDivOutput{OutputSize}; | |
| std::array<std::complex<T>, OutputSize> complex_div_result; | |
| { | |
| sycl::buffer<const std::complex<T>, 1> buffer_complex_div(complex_input, | |
| numOfDivInput); |
| complex<float> complex_input[] = {{1.e-6, 1.e-6}, | ||
| {-1.e-6, 1.e-6}, | ||
| {-1.e-6, -1.e-6}, | ||
| {1.e-6, -1.e-6}, | ||
| {1.e+6, 1.e-6}, | ||
| {-1.e+6, 1.e-6}, | ||
| {-1.e+6, -1.e-6}, | ||
| {1.e+6, -1.e-6}, | ||
|
|
||
| {1.e-6, 1.e+6}, | ||
| {-1.e-6, 1.e+6}, | ||
| {-1.e-6, -1.e+6}, | ||
| {1.e-6, -1.e+6}, | ||
|
|
||
| {1.e+6, 1.e+6}, | ||
| {-1.e+6, 1.e+6}, | ||
| {-1.e+6, -1.e+6}, | ||
| {1.e+6, -1.e+6}, | ||
|
|
||
| {NAN, NAN}, | ||
| {-INFINITY, NAN}, | ||
| {-2, NAN}, | ||
| {-1, NAN}, | ||
| {-0.5, NAN}, | ||
| {-0., NAN}, |
There was a problem hiding this comment.
In this complex<float> initializer list, literals like 1.e-6 are double. With braced initialization ({...}) this triggers C++ narrowing rules, and values like 1.e-6 are not exactly representable as float, so this is likely to be a hard compile error (“narrowing conversion from 'double' to 'float'”). Use f-suffixed literals (e.g. 1.e-6f) and/or switch these elements to complex<float>(...) (parentheses) to avoid narrowing errors.
| complex<float> complex_input[] = {{1.e-6, 1.e-6}, | |
| {-1.e-6, 1.e-6}, | |
| {-1.e-6, -1.e-6}, | |
| {1.e-6, -1.e-6}, | |
| {1.e+6, 1.e-6}, | |
| {-1.e+6, 1.e-6}, | |
| {-1.e+6, -1.e-6}, | |
| {1.e+6, -1.e-6}, | |
| {1.e-6, 1.e+6}, | |
| {-1.e-6, 1.e+6}, | |
| {-1.e-6, -1.e+6}, | |
| {1.e-6, -1.e+6}, | |
| {1.e+6, 1.e+6}, | |
| {-1.e+6, 1.e+6}, | |
| {-1.e+6, -1.e+6}, | |
| {1.e+6, -1.e+6}, | |
| {NAN, NAN}, | |
| {-INFINITY, NAN}, | |
| {-2, NAN}, | |
| {-1, NAN}, | |
| {-0.5, NAN}, | |
| {-0., NAN}, | |
| complex<float> complex_input[] = {{1.e-6f, 1.e-6f}, | |
| {-1.e-6f, 1.e-6f}, | |
| {-1.e-6f, -1.e-6f}, | |
| {1.e-6f, -1.e-6f}, | |
| {1.e+6f, 1.e-6f}, | |
| {-1.e+6f, 1.e-6f}, | |
| {-1.e+6f, -1.e-6f}, | |
| {1.e+6f, -1.e-6f}, | |
| {1.e-6f, 1.e+6f}, | |
| {-1.e-6f, 1.e+6f}, | |
| {-1.e-6f, -1.e+6f}, | |
| {1.e-6f, -1.e+6f}, | |
| {1.e+6f, 1.e+6f}, | |
| {-1.e+6f, 1.e+6f}, | |
| {-1.e+6f, -1.e+6f}, | |
| {1.e+6f, -1.e+6f}, | |
| {NAN, NAN}, | |
| {-INFINITY, NAN}, | |
| {-2.f, NAN}, | |
| {-1.f, NAN}, | |
| {-0.5f, NAN}, | |
| {-0.f, NAN}, |
Signed-off-by: jinge90 <ge.jin@intel.com>
Signed-off-by: jinge90 <ge.jin@intel.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
| deviceQueue.submit([&](sycl::handler &cgh) { | ||
| auto complex_mul_access = | ||
| buffer_complex_mul.template get_access<sycl::access::mode::read>(cgh); | ||
| auto complex_mul_res_access = | ||
| buffer_complex_mul_res.template get_access<sycl::access::mode::write>( | ||
| cgh); | ||
| cgh.single_task<class DeviceComplexMulTest>([=]() { | ||
| size_t i, j; | ||
| for (i = 0; i < InputSize; ++i) { |
There was a problem hiding this comment.
device_complex_test_mul is a function template, but the kernel is always named DeviceComplexMulTest. If this helper ever gets instantiated more than once in the same translation unit (e.g., for both float and double, or different InputSize), the identical kernel name type can cause a SYCL kernel name collision/ODR violation. Consider making the kernel name depend on the template parameters (e.g., a templated kernel-name type) or using an unnamed kernel form if supported by the project/toolchain.
| size_t i, j; | ||
| for (i = 0; i < InputSize; ++i) | ||
| for (j = 0; j < InputSize; ++j) { | ||
| if (complex_compare_mul(complex_input[i], complex_input[j], | ||
| complex_mul_result[i * InputSize + j])) { | ||
| return 1; | ||
| } | ||
| } | ||
| return 0; |
There was a problem hiding this comment.
Both device_complex_test_mul/device_complex_test_div return 1 on the first mismatch and don’t report which (i,j) input pair failed or what the computed/expected values were. That makes failures hard to diagnose in CI. Consider accumulating a failure count (like the other tests in this directory) and/or printing the failing indices and values before returning.
| deviceQueue.submit([&](sycl::handler &cgh) { | ||
| auto complex_div_access = | ||
| buffer_complex_div.template get_access<sycl::access::mode::read>(cgh); | ||
| auto complex_div_res_access = | ||
| buffer_complex_div_res.template get_access<sycl::access::mode::write>( | ||
| cgh); | ||
| cgh.single_task<class DeviceComplexDivTest>([=]() { | ||
| size_t i, j; | ||
| for (i = 0; i < InputSize; ++i) { |
There was a problem hiding this comment.
device_complex_test_div is a function template, but the kernel is always named DeviceComplexDivTest. If this helper is instantiated more than once in the same translation unit (different T/InputSize), the repeated kernel name type can cause a SYCL kernel name collision/ODR violation. Consider parameterizing the kernel-name type on the template arguments (or using an unnamed kernel form).
Signed-off-by: jinge90 <ge.jin@intel.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
Hi, @intel/llvm-reviewers-runtime and @sergey-semenov |
|
Hi, @uditagarwal97 |
libdevice provides __mulsc3, __divsc3, __muldc3, __divdc3 to support complex number multiplication and division in 'no-fast-math' mode on Linux platform only. These 4 builtins are not invoked by user code but inserted by compiler when handling complex mul/div expression. This PR adds e2e test for these builtins explicitly, the testing logic is ported from compiler-rt test suite for normal CPU platform:
https://github.com/llvm/llvm-project/blob/main/compiler-rt/test/builtins/Unit/divsc3_test.c
https://github.com/llvm/llvm-project/blob/main/compiler-rt/test/builtins/Unit/divdc3_test.c
https://github.com/llvm/llvm-project/blob/main/compiler-rt/test/builtins/Unit/mulsc3_test.c
https://github.com/llvm/llvm-project/blob/main/compiler-rt/test/builtins/Unit/muldc3_test.c