Skip to content

GH-46856: [C++][Python] Add binary view comparison kernels#49964

Open
Periecle wants to merge 2 commits into
apache:mainfrom
Periecle:fix-binary-view-compare-kernels
Open

GH-46856: [C++][Python] Add binary view comparison kernels#49964
Periecle wants to merge 2 commits into
apache:mainfrom
Periecle:fix-binary-view-compare-kernels

Conversation

@Periecle
Copy link
Copy Markdown

@Periecle Periecle commented May 12, 2026

Rationale for this change

pyarrow.compute.equal fails for pa.binary_view() arrays because C++ compute has no registered comparison kernel for (binary_view, binary_view).

This fixes that missing kernel path and also enables the same comparisons for utf8_view.

What changes are included in this PR?

This adds comparison kernel support for binary_view and utf8_view.

The following functions now work for same-type inputs:

  • equal
  • not_equal
  • greater
  • greater_equal
  • less
  • less_equal

Are these changes tested?

Added C++ tests covering:

  • inline and out-of-line values
  • nulls
  • sliced arrays
  • array-array comparisons
  • array-scalar and scalar-array comparisons
  • all six comparison functions

Added Python regression tests for pa.binary_view() and pa.string_view().

Verified the same cases fail before this patch at a0d2885b101acb439f7f79ec2237028974e74e64 with ArrowNotImplementedError: no kernel matching input types.

Are there any user-facing changes?

pyarrow.compute comparison functions now work for pa.binary_view() and pa.string_view() arrays where they previously failed with a missing kernel error.

AI Usage

Tests were generated by LLM agents along with part or PR summary

Addresses: GH-46856
Partially addresses: GH-44336

@Periecle Periecle requested review from AlenkaF, raulcd and rok as code owners May 12, 2026 08:04
@github-actions
Copy link
Copy Markdown

⚠️ GitHub issue #46856 has been automatically assigned in GitHub to PR creator.

@github-actions
Copy link
Copy Markdown

⚠️ GitHub issue #46856 has been automatically assigned in GitHub to PR creator.

3 similar comments
@github-actions
Copy link
Copy Markdown

⚠️ GitHub issue #46856 has been automatically assigned in GitHub to PR creator.

@github-actions
Copy link
Copy Markdown

⚠️ GitHub issue #46856 has been automatically assigned in GitHub to PR creator.

@github-actions
Copy link
Copy Markdown

⚠️ GitHub issue #46856 has been automatically assigned in GitHub to PR creator.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds missing comparison kernel registrations for Arrow C++ compute so that binary_view and utf8_view arrays can be compared (and thus pyarrow.compute.* comparison functions work for pa.binary_view() / pa.string_view() inputs).

Changes:

  • Register (binary_view, binary_view) and (utf8_view, utf8_view) comparison kernels in C++ compute.
  • Extend internal kernel codegen iteration utilities to support binary-view-like arrays.
  • Add C++ and Python tests covering comparisons for view types (including nulls and slicing).

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
python/pyarrow/tests/test_compute.py Adds Python regression tests ensuring view-type arrays work with the 6 comparison functions.
cpp/src/arrow/compute/kernels/scalar_compare.cc Registers comparison kernels for binary_view and utf8_view.
cpp/src/arrow/compute/kernels/scalar_compare_test.cc Adds C++ test coverage for array/array, sliced arrays, and array/scalar comparisons on view types.
cpp/src/arrow/compute/kernels/codegen_internal.h Adds ArrayIterator support for binary-view-like arrays so generated kernels can read values correctly.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +436 to +441
for (const std::shared_ptr<DataType>& ty : {binary_view(), utf8_view()}) {
auto exec =
GenerateVarBinaryViewBase<applicator::ScalarBinaryEqualTypes, BooleanType, Op>(
*ty);
DCHECK_OK(func->AddKernel({ty, ty}, boolean(), std::move(exec)));
}
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks valid, applied

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants