[SYCL] Add test to cover std::complex<float/double> mul/div#21622

Open
jinge90 wants to merge 30 commits into intel:sycl from jinge90:add_test_for_mulsc3_divsc3

@jinge90 jinge90 commented Mar 25, 2026

libdevice provides __mulsc3, __divsc3, __muldc3, and __divdc3 to support complex multiplication and division in 'no-fast-math' mode, on Linux only. User code does not call these four builtins directly; the compiler inserts them when it lowers complex mul/div expressions. This PR adds e2e tests that exercise these builtins explicitly. The test logic is ported from the compiler-rt test suite for CPU targets:
https://github.com/llvm/llvm-project/blob/main/compiler-rt/test/builtins/Unit/divsc3_test.c
https://github.com/llvm/llvm-project/blob/main/compiler-rt/test/builtins/Unit/divdc3_test.c
https://github.com/llvm/llvm-project/blob/main/compiler-rt/test/builtins/Unit/mulsc3_test.c
https://github.com/llvm/llvm-project/blob/main/compiler-rt/test/builtins/Unit/muldc3_test.c
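
For context, a minimal host-side sketch of why such builtins exist. This is not the libdevice or compiler-rt code: `naive_mul` applies the textbook formula, and `recovering_mul` adds only one of the infinity-recovery cases that C Annex G recommends (the case where the first operand is infinite). Both function names, and `recovery_example`, are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <limits>

// Textbook formula: (a+bi)(c+di) = (ac-bd) + (ad+bc)i.
inline std::complex<float> naive_mul(float a, float b, float c, float d) {
  return std::complex<float>(a * c - b * d, a * d + b * c);
}

// Simplified recovery (assumption: only the "first operand is infinite"
// case is handled; a full implementation covers more cases).
inline std::complex<float> recovering_mul(float a, float b, float c, float d) {
  std::complex<float> z = naive_mul(a, b, c, d);
  if (std::isnan(z.real()) && std::isnan(z.imag()) &&
      (std::isinf(a) || std::isinf(b))) {
    // Reduce the infinite operand to +-1 / +-0 and turn NaNs in the other
    // operand into signed zeros, then scale the product by infinity.
    a = std::copysign(std::isinf(a) ? 1.0f : 0.0f, a);
    b = std::copysign(std::isinf(b) ? 1.0f : 0.0f, b);
    if (std::isnan(c))
      c = std::copysign(0.0f, c);
    if (std::isnan(d))
      d = std::copysign(0.0f, d);
    const float inf = std::numeric_limits<float>::infinity();
    z = std::complex<float>(inf * (a * c - b * d), inf * (a * d + b * c));
  }
  return z;
}

// (inf + NaN*i) * (1 + i): the naive formula yields NaN in both parts,
// while the recovery step restores an infinite result.
inline void recovery_example() {
  const float inf = std::numeric_limits<float>::infinity();
  const float nan = std::numeric_limits<float>::quiet_NaN();
  const std::complex<float> n = naive_mul(inf, nan, 1.0f, 1.0f);
  assert(std::isnan(n.real()) && std::isnan(n.imag()));
  const std::complex<float> r = recovering_mul(inf, nan, 1.0f, 1.0f);
  assert(std::isinf(r.real()) && std::isinf(r.imag()));
}
```

Under fast-math the compiler may assume NaN and infinity never occur, which is presumably why the recovery path only matters in 'no-fast-math' mode.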

@jinge90 jinge90 requested a review from a team as a code owner March 25, 2026 08:39
@jinge90 jinge90 requested a review from sergey-semenov March 25, 2026 08:39
@jinge90 jinge90 marked this pull request as draft March 25, 2026 08:39
@jinge90 jinge90 marked this pull request as ready for review March 26, 2026 02:50
jinge90 added 2 commits March 26, 2026 11:15
Signed-off-by: jinge90 <ge.jin@intel.com>
Signed-off-by: jinge90 <ge.jin@intel.com>

jinge90 commented Mar 26, 2026

Hi, @intel/llvm-reviewers-runtime and @sergey-semenov
Could you take a look at this PR?
Thanks very much.

@uditagarwal97 uditagarwal97 requested a review from Copilot March 26, 2026 17:06

Copilot AI left a comment

Pull request overview

Adds new SYCL end-to-end coverage for device-side std::complex<float/double> multiplication and division, intended to exercise the libdevice/compiler-emitted __mul{sd}c3 / __div{sd}c3 builtins under “no-fast-math” behavior.

Changes:

  • Introduce shared complex mul/div test utilities and result validation helpers.
  • Add a comprehensive complex input matrix (incl. NaN/Inf/±0) and run mul/div tests for float and double.
  • Wire the new tests into the existing std_complex_math_* e2e test executables (non-Windows).

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File | Description
sycl/test-e2e/DeviceLib/std_complex_math_test.cpp | Adds std::complex<float> mul/div e2e coverage using shared helpers and a large edge-case input set.
sycl/test-e2e/DeviceLib/std_complex_math_fp64_test.cpp | Adds std::complex<double> mul/div e2e coverage using the shared helpers and input set.
sycl/test-e2e/DeviceLib/complex_utils.hpp | New helper header providing classification/compare logic and SYCL kernels for mul/div matrix evaluation.

Comment on lines 8 to 10

#include "complex_utils.hpp"
#include "math_utils.hpp"

Copilot AI Mar 26, 2026

This test is intended to exercise the complex mul/div builtins emitted in "no-fast-math" mode, but the RUN line here doesn’t pass -fno-fast-math (unlike std_complex_math_test.cpp and other DeviceLib math tests). Please define/use %{mathflags} and compile this test with -fno-fast-math to ensure the intended codegen is covered.

Contributor Author

Fixed.

Comment on lines +1 to +9
#pragma once
#include <complex>
#include <sycl/detail/core.hpp>
enum { zero, non_zero, inf, NaN, non_zero_nan };
template <typename T> int complex_classify(std::complex<T> x) {
if (x == std::complex<T>(0, 0))
return zero;
if (std::isinf(x.real()) || std::isinf(x.imag()))
return inf;

Copilot AI Mar 26, 2026

complex_utils.hpp uses std::array, std::isnan, and std::isinf but only includes <complex>. Please add the missing standard headers (e.g., <array> and <cmath>) so this helper is self-contained and doesn’t rely on include order in the including test files.

Contributor Author

Fixed.
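
For reference, a self-contained sketch of the classification helper with the headers the review asks for. Only the first two checks appear in the quoted snippet; the NaN handling after them is an assumption modeled on the classify helper in the compiler-rt tests this PR ports, and may differ from the PR's actual code. `classify_example` is a hypothetical usage function.

```cpp
#include <cassert>
#include <cmath>   // std::isnan, std::isinf, NAN
#include <complex>

enum { zero, non_zero, inf, NaN, non_zero_nan };

template <typename T> int complex_classify(std::complex<T> x) {
  if (x == std::complex<T>(0, 0))
    return zero;
  if (std::isinf(x.real()) || std::isinf(x.imag()))
    return inf;
  // Assumed continuation: distinguish "all NaN" from "NaN plus a nonzero part".
  if (std::isnan(x.real()) && std::isnan(x.imag()))
    return NaN;
  if (std::isnan(x.real()))
    return x.imag() == 0 ? NaN : non_zero_nan;
  if (std::isnan(x.imag()))
    return x.real() == 0 ? NaN : non_zero_nan;
  return non_zero;
}

// -2 + NaN*i has one NaN part and one nonzero finite part.
inline void classify_example() {
  assert(complex_classify(std::complex<float>(-2.0f, NAN)) == non_zero_nan);
}
```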

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.

Comment on lines 1 to 3
// DEFINE: %{mathflags} = %if cl_options %{/clang:-fno-fast-math%} %else %{-fno-fast-math%}
// RUN: %{build} %{mathflags} -o %t1.out
// RUN: %{run} %t1.out

Copilot AI Mar 31, 2026

The PR description says the mul/div builtins being exercised are Linux-only, but this test only excludes Windows (#ifndef _WIN32). That means the new mul/div coverage will still run on other non-Windows platforms (e.g. Darwin) where those builtins may not be available. Consider adding an appropriate LIT guard (e.g. // REQUIRES: linux or // UNSUPPORTED: system-darwin) so the test only runs where the targeted device-lib support exists.

Comment on lines 1 to 4
  // REQUIRES: aspect-fp64
- // RUN: %{build} -o %t1.out
+ // DEFINE: %{mathflags} = %if cl_options %{/clang:-fno-fast-math%} %else %{-fno-fast-math%}
+ // RUN: %{build} %{mathflags} -o %t1.out
  // RUN: %{run} %t1.out

Copilot AI Mar 31, 2026

Same issue here: the new mul/div builtins coverage is described as Linux-only, but the test is only gated with #ifndef _WIN32. Please add a LIT platform guard (e.g. // REQUIRES: linux or // UNSUPPORTED: system-darwin) so it doesn’t run on non-Linux non-Windows hosts by accident.

Comment on lines +234 to +237
complex<double> complex_input[] = {{1.e-6, 1.e-6},
{-1.e-6, 1.e-6},
{-1.e-6, -1.e-6},
{1.e-6, -1.e-6},

Copilot AI Mar 31, 2026

Same as in the float test: complex_input doesn’t need external linkage. Consider making it static (or move to an anonymous namespace) since it’s only consumed within this test TU.
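
A minimal sketch of the anonymous-namespace variant, shortened to the first four inputs quoted above. The `InputSize` constant is an assumption about how the element count might be derived; the PR may compute it differently.

```cpp
#include <complex>
#include <cstddef>

namespace {
// Internal linkage: the array is visible only inside this translation unit.
std::complex<double> complex_input[] = {
    {1.e-6, 1.e-6}, {-1.e-6, 1.e-6}, {-1.e-6, -1.e-6}, {1.e-6, -1.e-6}};

// Element count derived from the array itself (hypothetical name).
constexpr std::size_t InputSize =
    sizeof(complex_input) / sizeof(complex_input[0]);
} // namespace
```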

Comment on lines +300 to +307
std::complex<T> *complex_input) {
constexpr size_t OutputSize = InputSize * InputSize;
sycl::range<1> numOfMulInput{InputSize};
sycl::range<1> numOfMulOutput{OutputSize};
std::array<std::complex<T>, OutputSize> complex_mul_result;
{
sycl::buffer<std::complex<T>, 1> buffer_complex_mul(complex_input,
numOfMulInput);

Copilot AI Mar 31, 2026

device_complex_test_mul only reads from complex_input. Consider taking const std::complex<T>* (and creating the input buffer from a const pointer where possible) to make the API intent clear and prevent accidental mutation.

Suggested change
- std::complex<T> *complex_input) {
+ const std::complex<T> *complex_input) {
  constexpr size_t OutputSize = InputSize * InputSize;
  sycl::range<1> numOfMulInput{InputSize};
  sycl::range<1> numOfMulOutput{OutputSize};
  std::array<std::complex<T>, OutputSize> complex_mul_result;
  {
- sycl::buffer<std::complex<T>, 1> buffer_complex_mul(complex_input,
+ sycl::buffer<const std::complex<T>, 1> buffer_complex_mul(complex_input,
  numOfMulInput);
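
A host-only sketch of the same shape, with no SYCL involved: a read-only input pointer and an `InputSize * InputSize` result laid out so that the product of inputs `i` and `j` sits at index `i * InputSize + j`, which is the indexing the host-side check in this PR uses. The function name `host_mul_matrix` is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <complex>
#include <cstddef>

// Computes every pairwise product input[i] * input[j] on the host.
// The const pointer documents that the inputs are never modified.
template <typename T, std::size_t InputSize>
std::array<std::complex<T>, InputSize * InputSize>
host_mul_matrix(const std::complex<T> *input) {
  assert(input != nullptr);
  std::array<std::complex<T>, InputSize * InputSize> out{};
  for (std::size_t i = 0; i < InputSize; ++i)
    for (std::size_t j = 0; j < InputSize; ++j)
      out[i * InputSize + j] = input[i] * input[j];
  return out;
}
```

Because the parameter is `const`, both a `const` array and a mutable one can be passed without a cast.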

Comment on lines +340 to +347
std::complex<T> *complex_input) {
constexpr size_t OutputSize = InputSize * InputSize;
sycl::range<1> numOfDivInput{InputSize};
sycl::range<1> numOfDivOutput{OutputSize};
std::array<std::complex<T>, OutputSize> complex_div_result;
{
sycl::buffer<std::complex<T>, 1> buffer_complex_div(complex_input,
numOfDivInput);

Copilot AI Mar 31, 2026

device_complex_test_div only reads from complex_input. Consider taking const std::complex<T>* (and using a read-only buffer/accessor) to better express const-correctness.

Suggested change
- std::complex<T> *complex_input) {
+ const std::complex<T> *complex_input) {
  constexpr size_t OutputSize = InputSize * InputSize;
  sycl::range<1> numOfDivInput{InputSize};
  sycl::range<1> numOfDivOutput{OutputSize};
  std::array<std::complex<T>, OutputSize> complex_div_result;
  {
- sycl::buffer<std::complex<T>, 1> buffer_complex_div(complex_input,
+ sycl::buffer<const std::complex<T>, 1> buffer_complex_div(complex_input,
  numOfDivInput);

Comment on lines +220 to +244
complex<float> complex_input[] = {{1.e-6, 1.e-6},
{-1.e-6, 1.e-6},
{-1.e-6, -1.e-6},
{1.e-6, -1.e-6},
{1.e+6, 1.e-6},
{-1.e+6, 1.e-6},
{-1.e+6, -1.e-6},
{1.e+6, -1.e-6},

{1.e-6, 1.e+6},
{-1.e-6, 1.e+6},
{-1.e-6, -1.e+6},
{1.e-6, -1.e+6},

{1.e+6, 1.e+6},
{-1.e+6, 1.e+6},
{-1.e+6, -1.e+6},
{1.e+6, -1.e+6},

{NAN, NAN},
{-INFINITY, NAN},
{-2, NAN},
{-1, NAN},
{-0.5, NAN},
{-0., NAN},

Copilot AI Mar 31, 2026

In this complex<float> initializer list, literals like 1.e-6 are double. With braced initialization ({...}) this triggers C++ narrowing rules, and values like 1.e-6 are not exactly representable as float, so this is likely to be a hard compile error (“narrowing conversion from 'double' to 'float'”). Use f-suffixed literals (e.g. 1.e-6f) and/or switch these elements to complex<float>(...) (parentheses) to avoid narrowing errors.

Suggested change
- complex<float> complex_input[] = {{1.e-6, 1.e-6},
- {-1.e-6, 1.e-6},
- {-1.e-6, -1.e-6},
- {1.e-6, -1.e-6},
- {1.e+6, 1.e-6},
- {-1.e+6, 1.e-6},
- {-1.e+6, -1.e-6},
- {1.e+6, -1.e-6},
- {1.e-6, 1.e+6},
- {-1.e-6, 1.e+6},
- {-1.e-6, -1.e+6},
- {1.e-6, -1.e+6},
- {1.e+6, 1.e+6},
- {-1.e+6, 1.e+6},
- {-1.e+6, -1.e+6},
- {1.e+6, -1.e+6},
+ complex<float> complex_input[] = {{1.e-6f, 1.e-6f},
+ {-1.e-6f, 1.e-6f},
+ {-1.e-6f, -1.e-6f},
+ {1.e-6f, -1.e-6f},
+ {1.e+6f, 1.e-6f},
+ {-1.e+6f, 1.e-6f},
+ {-1.e+6f, -1.e-6f},
+ {1.e+6f, -1.e-6f},
+ {1.e-6f, 1.e+6f},
+ {-1.e-6f, 1.e+6f},
+ {-1.e-6f, -1.e+6f},
+ {1.e-6f, -1.e+6f},
+ {1.e+6f, 1.e+6f},
+ {-1.e+6f, 1.e+6f},
+ {-1.e+6f, -1.e+6f},
+ {1.e+6f, -1.e+6f},
  {NAN, NAN},
  {-INFINITY, NAN},
- {-2, NAN},
- {-1, NAN},
- {-0.5, NAN},
- {-0., NAN},
+ {-2.f, NAN},
+ {-1.f, NAN},
+ {-0.5f, NAN},
+ {-0.f, NAN},
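
A small check of the narrowing rule in question, offered as a sketch rather than a verdict on the PR. As far as the C++ list-initialization rules go, a double-to-float conversion is not treated as narrowing when the source is a constant expression whose value lies within float's range, even if it is not exactly representable, so the unsuffixed form would be expected to compile. The variable names are hypothetical.

```cpp
#include <complex>

// Double literals that are constant expressions within float's range:
// accepted by braced initialization.
constexpr std::complex<float> from_double_literals{1.e+6, 1.e-6};

// Explicit float literals: no implicit conversion involved at all.
constexpr std::complex<float> from_float_literals{1.e+6f, 1.e-6f};

// By contrast, a constant outside float's range is a narrowing error:
// constexpr std::complex<float> out_of_range{1.e+60, 0.0};
```

The `f` suffix is still a reasonable choice, since it states the intended type directly and avoids rounding through double first.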

jinge90 and others added 4 commits March 31, 2026 16:46
Signed-off-by: jinge90 <ge.jin@intel.com>
Signed-off-by: jinge90 <ge.jin@intel.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

Comment on lines +310 to +318
deviceQueue.submit([&](sycl::handler &cgh) {
auto complex_mul_access =
buffer_complex_mul.template get_access<sycl::access::mode::read>(cgh);
auto complex_mul_res_access =
buffer_complex_mul_res.template get_access<sycl::access::mode::write>(
cgh);
cgh.single_task<class DeviceComplexMulTest>([=]() {
size_t i, j;
for (i = 0; i < InputSize; ++i) {

Copilot AI Apr 1, 2026

device_complex_test_mul is a function template, but the kernel is always named DeviceComplexMulTest. If this helper ever gets instantiated more than once in the same translation unit (e.g., for both float and double, or different InputSize), the identical kernel name type can cause a SYCL kernel name collision/ODR violation. Consider making the kernel name depend on the template parameters (e.g., a templated kernel-name type) or using an unnamed kernel form if supported by the project/toolchain.
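
One way to make the kernel name depend on the template parameters, sketched without SYCL so it compiles on its own. The templated tag reuses the name `DeviceComplexMulTest` from the snippet above; in the helper it would presumably be passed as `cgh.single_task<DeviceComplexMulTest<T, InputSize>>(...)`, which is an assumption about how the PR would apply it.

```cpp
#include <cstddef>
#include <type_traits>

// Forward-declared and never defined: used purely as a kernel-name tag.
template <typename T, std::size_t InputSize> class DeviceComplexMulTest;

// Each (T, InputSize) instantiation names a distinct type, so two
// instantiations of the helper no longer share one kernel name.
static_assert(!std::is_same<DeviceComplexMulTest<float, 4>,
                            DeviceComplexMulTest<double, 4>>::value,
              "different element types give different kernel names");
```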

Comment on lines +327 to +335
size_t i, j;
for (i = 0; i < InputSize; ++i)
for (j = 0; j < InputSize; ++j) {
if (complex_compare_mul(complex_input[i], complex_input[j],
complex_mul_result[i * InputSize + j])) {
return 1;
}
}
return 0;

Copilot AI Apr 1, 2026

Both device_complex_test_mul/device_complex_test_div return 1 on the first mismatch and don’t report which (i,j) input pair failed or what the computed/expected values were. That makes failures hard to diagnose in CI. Consider accumulating a failure count (like the other tests in this directory) and/or printing the failing indices and values before returning.
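
A sketch of the suggested reporting style: count every mismatch and print the failing indices and values instead of returning on the first one. `count_failures` and its `mismatch` callback are hypothetical; the callback stands in for the PR's `complex_compare_mul`, on the assumption that a nonzero return means failure.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <iostream>

// Walks the InputSize x InputSize result matrix, logs each mismatch with
// its (i, j) indices, and returns the total number of failures.
template <typename T, typename Mismatch>
int count_failures(const std::complex<T> *input, const std::complex<T> *result,
                   std::size_t input_size, Mismatch mismatch) {
  assert(input != nullptr && result != nullptr);
  int failures = 0;
  for (std::size_t i = 0; i < input_size; ++i)
    for (std::size_t j = 0; j < input_size; ++j) {
      const std::complex<T> &r = result[i * input_size + j];
      if (mismatch(input[i], input[j], r)) {
        std::cerr << "mismatch at (" << i << ", " << j << "): inputs "
                  << input[i] << " and " << input[j] << ", got " << r << '\n';
        ++failures;
      }
    }
  return failures;
}
```

Returning the count lets the caller keep the existing convention of a nonzero value meaning failure while leaving a usable trail in the CI log.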

Comment on lines +350 to +358
deviceQueue.submit([&](sycl::handler &cgh) {
auto complex_div_access =
buffer_complex_div.template get_access<sycl::access::mode::read>(cgh);
auto complex_div_res_access =
buffer_complex_div_res.template get_access<sycl::access::mode::write>(
cgh);
cgh.single_task<class DeviceComplexDivTest>([=]() {
size_t i, j;
for (i = 0; i < InputSize; ++i) {

Copilot AI Apr 1, 2026

device_complex_test_div is a function template, but the kernel is always named DeviceComplexDivTest. If this helper is instantiated more than once in the same translation unit (different T/InputSize), the repeated kernel name type can cause a SYCL kernel name collision/ODR violation. Consider parameterizing the kernel-name type on the template arguments (or using an unnamed kernel form).

jinge90 and others added 2 commits April 1, 2026 14:44
Signed-off-by: jinge90 <ge.jin@intel.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

jinge90 commented Apr 1, 2026

Hi, @intel/llvm-reviewers-runtime and @sergey-semenov
This PR ports the compiler-rt tests for the __mulsc3/__muldc3 and __divsc3/__divdc3 builtins to SYCL for the Linux platform. Could you help review it?
Thanks very much.

@jinge90 jinge90 requested a review from uditagarwal97 April 2, 2026 01:31

jinge90 commented Apr 2, 2026

Hi, @uditagarwal97
Could you help review this PR? Sergey is on vacation.
Thanks very much.
