torchcodec-xpu: add XPU encoding support #58
Open
eromomon wants to merge 4 commits into
Signed-off-by: Edgar Romo Montiel <edgar.romo.montiel@intel.com>
dvrogozh
requested changes
May 26, 2026
// ============================================================
// Encoding: encodeConvert_SYCL
// ============================================================
UniqueAVFrame XpuDeviceInterface::encodeConvert_SYCL(
Contributor
Suggested change
- UniqueAVFrame XpuDeviceInterface::encodeConvert_SYCL(
+ UniqueAVFrame XpuDeviceInterface::convertTensorToAVFrameForEncoding_SYCL(
Comment on lines +629 to +642
UniqueAVFrame vaFrame(av_frame_alloc());
TORCH_CHECK(vaFrame != nullptr, "Failed to allocate AVFrame for encoding");
vaFrame->format = AV_PIX_FMT_VAAPI;
vaFrame->height = static_cast<int>(tensor.sizes()[1]);
vaFrame->width = static_cast<int>(tensor.sizes()[2]);
vaFrame->pts = frameIndex;

// Allocate a VAAPI surface from the hw_frames_ctx pool created in
// setupHardwareFrameContextForEncoding.
int ret = av_hwframe_get_buffer(codecContext->hw_frames_ctx, vaFrame.get(), 0);
TORCH_CHECK(
    ret >= 0,
    "av_hwframe_get_buffer failed: ",
    getFFMPEGErrorStringFromErrorCode(ret));
Contributor
Make a helper function out of this block:
UniqueAVFrame allocNV12Frame(int width, int height, int frameIndex) {...}
if (xpu::use_sycl_color_conversion_kernel()) {
  VLOG(9) << "[XPU Encoder] Encoding frame " << frameIndex
          << " via SYCL on device=xpu:" << device_.index();
  return encodeConvert_SYCL(tensor, codecContext, std::move(vaFrame));
Contributor
- Name the functions convertTensorToAVFrameForEncoding_SYCL and convertTensorToAVFrameForEncoding_CPU.
- Do NOT pass std::move(avFrame) just to return it from the function; that's a bad pattern. Instead, allocate the frame inside the function. That's why you need a helper allocNV12Frame() to avoid duplicated code. The function prototypes should then be the same as the original convertTensorToAVFrameForEncoding().
- You do not need WITH_SYCL_KERNELS here; you can handle all of that inside convertTensorToAVFrameForEncoding_SYCL(). I.e.:
UniqueAVFrame avFrame = convertTensorToAVFrameForEncoding_SYCL();
if (!avFrame) {
avFrame = convertTensorToAVFrameForEncoding_CPU();
}
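The pattern requested above can be sketched in isolation. The following is only an illustration under stated assumptions, not the PR's code: FakeFrame, DeviceState, and the convertForEncoding* names are hypothetical stand-ins so the sketch compiles without FFmpeg or SYCL, and the real allocNV12Frame() would wrap av_frame_alloc() and av_hwframe_get_buffer() as in the block under review.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for AVFrame; only the fields this sketch needs.
struct FakeFrame {
  int width = 0;
  int height = 0;
  long pts = 0;
  std::string convertedBy;  // records which path filled the frame
};
using UniqueFrame = std::unique_ptr<FakeFrame>;

// Hypothetical stand-in for the state the SYCL path checks before running.
struct DeviceState {
  bool syclKernelEnabled = false;
  bool hasFp64 = false;
};

// Shared allocation helper so neither conversion path duplicates the setup.
UniqueFrame allocNV12Frame(int width, int height, long frameIndex) {
  auto frame = std::make_unique<FakeFrame>();
  frame->width = width;
  frame->height = height;
  frame->pts = frameIndex;
  return frame;
}

// SYCL path: returns nullptr when it cannot run, so the caller can fall back.
UniqueFrame convertForEncodingSycl(
    const DeviceState& dev, int width, int height, long frameIndex) {
  if (!dev.syclKernelEnabled || !dev.hasFp64) {
    return nullptr;
  }
  UniqueFrame frame = allocNV12Frame(width, height, frameIndex);
  frame->convertedBy = "sycl";  // the real code would launch the kernel here
  return frame;
}

// CPU path: always available, allocates its own frame through the helper.
UniqueFrame convertForEncodingCpu(int width, int height, long frameIndex) {
  UniqueFrame frame = allocNV12Frame(width, height, frameIndex);
  frame->convertedBy = "cpu";  // the real code would call libswscale here
  return frame;
}

// Call site: try SYCL first, fall back to CPU when it returns nullptr.
UniqueFrame convertForEncoding(
    const DeviceState& dev, int width, int height, long frameIndex) {
  UniqueFrame frame = convertForEncodingSycl(dev, width, height, frameIndex);
  if (!frame) {
    frame = convertForEncodingCpu(width, height, frameIndex);
  }
  return frame;
}
```

Because each path allocates through the helper, both keep the same prototype and no frame has to be moved into a function only to be returned from it.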
    const torch::stable::Tensor& tensor,
    AVCodecContext* codecContext,
    UniqueAVFrame vaFrame) {
#ifdef WITH_SYCL_KERNELS
Contributor
Do the same as in decoding path:
if (!xpu::use_sycl_color_conversion_kernel()) {
return nullptr;
}
if (!has_fp64_) {
return nullptr;
}
UniqueAVFrame avFrame;
#ifdef WITH_SYCL_KERNELS
avFrame = allocNV12Frame();
....
#endif
return avFrame;
// Layout A: 1 layer, 2 planes - layers[0].planes[0]=Y, layers[0].planes[1]=UV
// Layout B: 2 layers, 1 plane each - layers[0].planes[0]=Y, layers[1].planes[0]=UV
const bool layoutA = (desc.num_layers == 1 && desc.layers[0].num_planes == 2);
const bool layoutB = (desc.num_layers == 2 && desc.layers[0].num_planes == 1
Contributor
Well, yes, except that we don't have any other driver that uses a different layout. I am not sure we should implement something we have never tested.
void registerHardwareDeviceWithCodec(AVCodecContext* codecContext) override;

// ---- Encoding overrides ----
Contributor
If you add an "Encoding overrides" section, then you probably need to add a "Decoding overrides" section as well.
How to reproduce FFmpeg RGB->YUV matrix values
1. Expose ff_fill_rgb2yuv_table in libavfilter/libavfilter.v:
add "ff_fill_rgb2yuv_table;" under the global section.
Example:
libavfilter/libavfilter.v
LIBAVFILTER_MAJOR {
global:
avfilter_*;
av_*;
+ ff_fill_rgb2yuv_table;
local:
*;
};
2. Rebuild FFmpeg:
cd ffmpeg && ./configure && make -j$(nproc) && make install
nm -D <prefix>/lib/libavfilter.so | grep ff_fill_rgb2yuv_table
3. Create rgb2yuv_test.c calling
ff_fill_rgb2yuv_table(av_csp_luma_coeffs_from_avcsp(cs), m)
for cs = AVCOL_SPC_BT709 and AVCOL_SPC_BT470BG.
4. Build:
gcc rgb2yuv_test.c -o rgb2yuv_test \
-I<prefix>/include -L<prefix>/lib \
-lavfilter -lavutil -Wl,-rpath,<prefix>/lib
Extends VideoEncoder to support Intel XPU devices.
Encoder.cpp: extended kStableCUDA device-type checks to also match kStableXPU in both VideoEncoder and MultiStreamEncoder, enabling the hardware encoding path (hw frame context setup, pixel format selection, device registration).
XpuDeviceInterface: implemented setupHardwareFrameContextForEncoding and convertTensorToAVFrameForEncoding. RGB→NV12 conversion is done via a SYCL kernel, with libswscale as the CPU fallback.