torchcodec-xpu: add XPU encoding support#58

Open
eromomon wants to merge 4 commits into intel:main from eromomon:eromomon/encoding

@eromomon
Contributor

Extends VideoEncoder to support Intel XPU devices.

- Encoder.cpp: extended the kStableCUDA device-type checks to also match kStableXPU in both VideoEncoder and MultiStreamEncoder, which enables the hardware encoding path (hw frame context setup, pixel format selection, device registration).
- XpuDeviceInterface: implemented setupHardwareFrameContextForEncoding and convertTensorToAVFrameForEncoding. RGB→NV12 conversion is done via a SYCL kernel, or via libswscale as a CPU fallback.
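
For readers unfamiliar with the NV12 target format, here is a minimal CPU reference sketch of an RGB→NV12 conversion. It is not the SYCL kernel from this PR and not what libswscale does internally. It assumes a planar 8-bit 3 x H x W input (suggested by the use of `tensor.sizes()[1]` and `tensor.sizes()[2]` as height and width), BT.601 limited-range coefficients, and chroma obtained by averaging each 2x2 block. `Nv12Image`, `toByte`, and `rgbChwToNv12` are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for an NV12 image: a full-resolution Y plane
// plus one half-resolution plane of interleaved U,V bytes.
struct Nv12Image {
  int width = 0;
  int height = 0;
  std::vector<uint8_t> y;   // width * height bytes
  std::vector<uint8_t> uv;  // (width / 2) * (height / 2) * 2 bytes
};

// Round and clamp a floating-point sample to the 8-bit range.
static uint8_t toByte(double v) {
  return static_cast<uint8_t>(std::min(255L, std::max(0L, std::lround(v))));
}

// Convert planar 8-bit RGB (3 x H x W, an assumed layout) to NV12 using
// BT.601 limited-range coefficients (an assumption, not taken from the PR).
// Width and height must be even.
Nv12Image rgbChwToNv12(const std::vector<uint8_t>& rgb, int width, int height) {
  Nv12Image out;
  out.width = width;
  out.height = height;
  const std::size_t plane = static_cast<std::size_t>(width) * height;
  out.y.resize(plane);
  out.uv.resize(static_cast<std::size_t>(width / 2) * (height / 2) * 2);
  const uint8_t* r = rgb.data();
  const uint8_t* g = r + plane;
  const uint8_t* b = g + plane;

  // Luma: one sample per pixel.
  for (std::size_t i = 0; i < plane; ++i) {
    out.y[i] = toByte(
        16.0 + (65.481 * r[i] + 128.553 * g[i] + 24.966 * b[i]) / 255.0);
  }

  // Chroma: average each 2x2 block, then store U and V interleaved.
  for (int cy = 0; cy < height / 2; ++cy) {
    for (int cx = 0; cx < width / 2; ++cx) {
      double rs = 0.0, gs = 0.0, bs = 0.0;
      for (int dy = 0; dy < 2; ++dy) {
        for (int dx = 0; dx < 2; ++dx) {
          const std::size_t i =
              static_cast<std::size_t>(2 * cy + dy) * width + (2 * cx + dx);
          rs += r[i];
          gs += g[i];
          bs += b[i];
        }
      }
      rs /= 4.0;
      gs /= 4.0;
      bs /= 4.0;
      const std::size_t o =
          (static_cast<std::size_t>(cy) * (width / 2) + cx) * 2;
      out.uv[o] =
          toByte(128.0 + (-37.797 * rs - 74.203 * gs + 112.0 * bs) / 255.0);
      out.uv[o + 1] =
          toByte(128.0 + (112.0 * rs - 93.786 * gs - 18.214 * bs) / 255.0);
    }
  }
  return out;
}
```

A reference like this is mainly useful as a test oracle: the output of a GPU kernel can be compared against it plane by plane, provided both sides agree on coefficients and range.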

eromomon added 2 commits May 25, 2026 15:34
Signed-off-by: Edgar Romo Montiel <edgar.romo.montiel@intel.com>
Signed-off-by: Edgar Romo Montiel <edgar.romo.montiel@intel.com>
@eromomon eromomon requested review from dvrogozh and removed request for dvrogozh May 25, 2026 23:07
@dvrogozh dvrogozh changed the title Add XPU encoding support to Encoder torchcodec-xpu: add XPU encoding support May 26, 2026
// ============================================================
// Encoding: encodeConvert_SYCL
// ============================================================
UniqueAVFrame XpuDeviceInterface::encodeConvert_SYCL(
Contributor

Suggested change
- UniqueAVFrame XpuDeviceInterface::encodeConvert_SYCL(
+ UniqueAVFrame XpuDeviceInterface::convertTensorToAVFrameForEncoding_SYCL(

Comment on lines +629 to +642
UniqueAVFrame vaFrame(av_frame_alloc());
TORCH_CHECK(vaFrame != nullptr, "Failed to allocate AVFrame for encoding");
vaFrame->format = AV_PIX_FMT_VAAPI;
vaFrame->height = static_cast<int>(tensor.sizes()[1]);
vaFrame->width = static_cast<int>(tensor.sizes()[2]);
vaFrame->pts = frameIndex;

// Allocate a VAAPI surface from the hw_frames_ctx pool created in
// setupHardwareFrameContextForEncoding.
int ret = av_hwframe_get_buffer(codecContext->hw_frames_ctx, vaFrame.get(), 0);
TORCH_CHECK(
ret >= 0,
"av_hwframe_get_buffer failed: ",
getFFMPEGErrorStringFromErrorCode(ret));
Contributor

Make a helper function out of this block:

UniqueAVFrame allocNV12Frame(int width, int height, int frameIndex) {...}

if (xpu::use_sycl_color_conversion_kernel()) {
VLOG(9) << "[XPU Encoder] Encoding frame " << frameIndex
<< " via SYCL on device=xpu:" << device_.index();
return encodeConvert_SYCL(tensor, codecContext, std::move(vaFrame));
Contributor

  1. Name the functions convertTensorToAVFrameForEncoding_SYCL and convertTensorToAVFrameForEncoding_CPU.
  2. Do NOT pass std::move(avFrame) into a function just to return it from that function; that's a bad pattern. Instead, allocate the frame inside the function. That's why you need a helper allocNV12Frame(): to avoid duplicated code. The function prototypes should then be the same as the original convertTensorToAVFrameForEncoding().
  3. You do not need WITH_SYCL_KERNELS here; you can handle all of that inside convertTensorToAVFrameForEncoding_SYCL(). I.e.:
UniqueAVFrame avFrame = convertTensorToAVFrameForEncoding_SYCL();
if (!avFrame) {
    avFrame = convertTensorToAVFrameForEncoding_CPU();
}
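
The ownership and fallback pattern described in the three points above can be sketched in isolation as follows. This is a self-contained illustration with stand-in types, not the torchcodec code: `FakeFrame`, `allocFrame`, `ConverterConfig`, `convertSycl`, `convertCpu`, and `convert` are all hypothetical names, and the `syclAvailable` flag stands in for whatever runtime and build-time checks the real SYCL path performs.

```cpp
#include <memory>
#include <string>

// Stand-in for FFmpeg's AVFrame; the real code uses UniqueAVFrame.
struct FakeFrame {
  int width = 0;
  int height = 0;
  std::string producedBy;
};
using UniqueFakeFrame = std::unique_ptr<FakeFrame>;

// Hypothetical shared helper, playing the role of allocNV12Frame().
UniqueFakeFrame allocFrame(int width, int height) {
  auto frame = std::make_unique<FakeFrame>();
  frame->width = width;
  frame->height = height;
  return frame;
}

// Hypothetical switch standing in for the SYCL availability checks.
struct ConverterConfig {
  bool syclAvailable = false;
};

// Accelerated path: allocates its own frame and returns nullptr when it
// cannot handle the request, so the caller never needs to know why.
UniqueFakeFrame convertSycl(const ConverterConfig& cfg, int width, int height) {
  if (!cfg.syclAvailable) {
    return nullptr;
  }
  UniqueFakeFrame frame = allocFrame(width, height);
  frame->producedBy = "sycl";
  return frame;
}

// CPU path: same prototype; always succeeds in this sketch.
UniqueFakeFrame convertCpu(const ConverterConfig&, int width, int height) {
  UniqueFakeFrame frame = allocFrame(width, height);
  frame->producedBy = "cpu";
  return frame;
}

// Dispatcher: try the accelerated path, fall back on a null result.
UniqueFakeFrame convert(const ConverterConfig& cfg, int width, int height) {
  UniqueFakeFrame frame = convertSycl(cfg, width, height);
  if (!frame) {
    frame = convertCpu(cfg, width, height);
  }
  return frame;
}
```

Because each path allocates through the same helper and returns an owning pointer, no frame has to be moved into a callee only to be handed back, and the dispatcher stays free of preprocessor conditionals.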

const torch::stable::Tensor& tensor,
AVCodecContext* codecContext,
UniqueAVFrame vaFrame) {
#ifdef WITH_SYCL_KERNELS
Contributor

Do the same as in decoding path:

  if (!xpu::use_sycl_color_conversion_kernel()) {
    return nullptr;
  }
  if (!has_fp64_) {
    return nullptr;
  }
  UniqueAVFrame avFrame;
#ifdef WITH_SYCL_KERNELS
  avFrame = allocNV12Frame();
....
#endif
  return avFrame;

// Layout A: 1 layer, 2 planes — layers[0].planes[0]=Y, layers[0].planes[1]=UV
// Layout B: 2 layers, 1 plane each — layers[0].planes[0]=Y, layers[1].planes[0]=UV
const bool layoutA = (desc.num_layers == 1 && desc.layers[0].num_planes == 2);
const bool layoutB = (desc.num_layers == 2 && desc.layers[0].num_planes == 1
Contributor

Well, yes, except that we don't have any other driver that uses a different layout. I am not sure we should implement something we have never tested.


void registerHardwareDeviceWithCodec(AVCodecContext* codecContext) override;

// ---- Encoding overrides ----
Contributor

If you added "Encoding overrides", then probably you need to add "Decoding overrides" as well.

eromomon added 2 commits May 28, 2026 08:25
How to reproduce FFmpeg RGB->YUV matrix values

1. Expose ff_fill_rgb2yuv_table in libavfilter/libavfilter.v:
   add "ff_fill_rgb2yuv_table;" under the global section.
   Example:
	libavfilter/libavfilter.v
	 LIBAVFILTER_MAJOR {
	     global:
	         avfilter_*;
	         av_*;
	+        ff_fill_rgb2yuv_table;
	     local:
	         *;
	 };

2. Rebuild FFmpeg:
   cd ffmpeg && ./configure && make -j$(nproc) && make install
   nm -D <prefix>/lib/libavfilter.so | grep ff_fill_rgb2yuv_table

3. Create rgb2yuv_test.c calling
   ff_fill_rgb2yuv_table(av_csp_luma_coeffs_from_avcsp(cs), m)
   for AVCOL_SPC_BT709 and AVCOL_SPC_BT470BG.

4. Build:
   gcc rgb2yuv_test.c -o rgb2yuv_test \
       -I<prefix>/include -L<prefix>/lib \
       -lavfilter -lavutil -Wl,-rpath,<prefix>/lib

Signed-off-by: Edgar Romo Montiel <edgar.romo.montiel@intel.com>