Add benchmark for WavDecoder by NicolasHug · Pull Request #1474 · meta-pytorch/torchcodec

NicolasHug · 2026-06-01T16:40:45Z

Adds a benchmark comparing WavDecoder against AudioDecoder, soundfile (float), and soundfile (native). "Native" means the dtype closest to the WAV source, which usually needs no conversion, though not always: a u8 WAV, for example, still has to be converted to int16. Both WavDecoder and AudioDecoder always convert to float32.
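For readers unfamiliar with what "convert to float32" involves, here is a minimal stdlib-only sketch of PCM-to-float conversion. The function name `pcm_to_float` and the power-of-two scale factors are assumptions chosen for illustration (a common convention); this is not necessarily the exact scaling WavDecoder, AudioDecoder, or soundfile apply.

```python
import struct


def pcm_to_float(raw: bytes, sample_format: str) -> list[float]:
    """Convert little-endian PCM bytes to floats in [-1.0, 1.0).

    Hypothetical helper: the scale factors are an assumed convention,
    not taken from torchcodec or soundfile.
    """
    if sample_format == "u8":
        # 8-bit WAV samples are unsigned, with silence at 128.
        return [(b - 128) / 128.0 for b in raw]
    if sample_format == "s16":
        count = len(raw) // 2
        samples = struct.unpack(f"<{count}h", raw[: count * 2])
        return [s / 32768.0 for s in samples]
    if sample_format == "s24":
        # There is no native 24-bit integer type, so each sample is
        # assembled from 3 bytes and sign-extended individually.
        usable = len(raw) - len(raw) % 3
        return [
            int.from_bytes(raw[i : i + 3], "little", signed=True) / 8388608.0
            for i in range(0, usable, 3)
        ]
    raise ValueError(f"unsupported sample format: {sample_format}")
```

The s24 branch needs per-sample byte assembly, which might contribute to the s24 rows being slower than the others in the tables, though that is a guess rather than something measured here.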

Results on my machine are below. Unsurprisingly, WavDecoder is much faster than AudioDecoder. We're faster than soundfile (float) overall, but I would take the results against soundfile with a grain of salt: I observed vastly different perf depending on which libsoundfile.so gets resolved, and most long-file benchmarks show only a modest improvement anyway. Also unsurprisingly, soundfile tends to be faster in the 'native' scenario since it does less: it doesn't convert to float32 like we do. The one bit that I still cannot explain despite some investigation is the WAV float32 decoding time:

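format     duration    WavDec (ms)  AudioDec (ms)  sndfile f32 (ms)  sndfile native (ms)  AudioDec/WavDec  sndfile f32/WavDec  sndfile nat/WavDec
float32    5min              18.95          37.02              5.71                 5.71            1.95x               0.30x               0.30x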

How can we take 18ms while soundfile takes 5? That makes no sense to me: we literally just copy the entire data into the output tensor in one shot. The cost of a memcpy for that size is in the range of 18ms, so there's no logical explanation for the libsoundfile perf other than some sort of caching. In cache? Via memmap? Via some smart numpy mechanism? I have no idea, I couldn't figure it out.
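To put the memcpy reasoning in concrete terms, here is a rough back-of-the-envelope sketch. The sample rate (44.1 kHz) and channel count (2) are assumptions, since the description does not state them; only the 5 min duration, the 4-byte float32 sample size, and the two timings come from the results.

```python
def pcm_size_bytes(duration_s, sample_rate, channels, bytes_per_sample):
    """Size of the raw sample data, ignoring the small WAV header."""
    return int(duration_s * sample_rate * channels * bytes_per_sample)


def throughput_gb_per_s(num_bytes, elapsed_ms):
    """Effective bytes moved per second, in GB/s (1 GB = 1e9 bytes)."""
    return num_bytes / (elapsed_ms / 1000.0) / 1e9


# ASSUMED: 44.1 kHz stereo. Duration and float32 width are from the benchmark.
size = pcm_size_bytes(duration_s=300, sample_rate=44_100, channels=2,
                      bytes_per_sample=4)

for label, elapsed_ms in [("WavDecoder", 18.95), ("soundfile f32", 5.71)]:
    rate = throughput_gb_per_s(size, elapsed_ms)
    print(f"{label}: {size / 1e6:.1f} MB in {elapsed_ms} ms = {rate:.1f} GB/s")
```

Under those assumptions the payload is about 106 MB, and the two timings correspond to very different effective rates. Comparing each against the machine's measured memory bandwidth would be one way to check whether the faster number can involve a full copy of the data at all.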


===========================================================================================================================================================
SUMMARY
===========================================================================================================================================================
format     duration    WavDec (ms)  AudioDec (ms)  sndfile f32 (ms)  sndfile native (ms)  AudioDec/WavDec  sndfile f32/WavDec  sndfile nat/WavDec
-----------------------------------------------------------------------------------------------------------------------------------------------------------
u8         10s                0.09           0.86              0.60                 0.49            9.11x               6.36x               5.23x
u8         5min              16.67          35.90             21.06                13.53            2.15x               1.26x               0.81x
s16        10s                0.10           8.12              0.77                 0.09           81.97x               7.79x               0.89x
s16        5min              16.51          40.65             26.26                 1.33            2.46x               1.59x               0.08x
s24        10s                0.39           1.01              1.57                 1.36            2.57x               4.01x               3.46x
s24        5min              26.54          41.75             49.99                43.44            1.57x               1.88x               1.64x
s32        10s                0.11           0.80              0.76                 0.13            7.11x               6.74x               1.15x
s32        5min              17.91          35.64             26.21                 5.75            1.99x               1.46x               0.32x
float32    10s                0.10           0.83              0.13                 0.12            8.14x               1.23x               1.22x
float32    5min              18.95          37.02              5.71                 5.71            1.95x               0.30x               0.30x
float64    10s                0.21           1.07              0.80                 0.21            5.11x               3.85x               1.00x
float64    5min              21.86          45.27             29.77                12.03            2.07x               1.36x               0.55x

Also benchmarked different input types just for WavDecoder:


====================================================================================================
FILE vs FILE-LIKE vs BYTES
====================================================================================================
format     duration      file (ms)   file-like (ms)   bytes (ms)   flike/file   bytes/file
----------------------------------------------------------------------------------------------------
u8         10s                0.13             0.18         0.09        1.40x        0.70x
u8         5min              18.27            19.69        16.59        1.08x        0.91x
s16        10s                0.15             0.25         0.09        1.63x        0.60x
s16        5min              19.50            22.68        16.74        1.16x        0.86x
s24        10s                0.49             0.62         0.39        1.27x        0.80x
s24        5min              30.48            34.64        26.30        1.14x        0.86x
s32        10s                0.22             0.41         0.11        1.84x        0.50x
s32        5min              23.16            29.40        17.74        1.27x        0.77x
float32    10s                0.15             0.21         0.10        1.46x        0.66x
float32    5min              18.42            37.25        18.93        2.02x        1.03x
float64    10s                0.41             0.82         0.20        2.01x        0.48x
float64    5min              32.69            45.01        22.57        1.38x        0.69x

Unsurprisingly, the file-like object adds some overhead, and reading from bytes is faster in all but one row (float32 5min, at 1.03x) since it bypasses the I/O part. This is consistent with expectations.
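For anyone wanting to reproduce the shape of this comparison without torchcodec installed, here is a minimal timing sketch. It uses the stdlib `wave` module as a stand-in reader, so it is not the benchmark script from this PR and its numbers are not comparable to the table above. The helper names (`make_wav_bytes`, `read_frames`, `median_ms`) are made up for the example, and since `wave` does not accept raw bytes, the "bytes" case wraps them in `io.BytesIO`.

```python
import io
import os
import statistics
import tempfile
import time
import wave


def make_wav_bytes(seconds=1, rate=8000):
    """Build a small mono 16-bit silent WAV entirely in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * (seconds * rate))
    return buf.getvalue()


def read_frames(source):
    """Read all frames from a path or a binary file-like object."""
    with wave.open(source, "rb") as w:
        return w.readframes(w.getnframes())


def median_ms(fn, repeats=20):
    """Median wall-clock time of fn() in milliseconds."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        times.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(times)


data = make_wav_bytes()
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "silence.wav")
    with open(path, "wb") as f:
        f.write(data)

    def from_path():
        return read_frames(path)

    def from_file_like():
        with open(path, "rb") as f:
            return read_frames(f)

    def from_bytes():
        return read_frames(io.BytesIO(data))

    cases = {"file": from_path, "file-like": from_file_like, "bytes": from_bytes}
    payload_lengths = {name: len(fn()) for name, fn in cases.items()}
    results = {name: median_ms(fn) for name, fn in cases.items()}

for name, ms in results.items():
    print(f"{name:10s} {ms:.3f} ms  ({ms / results['file']:.2f}x vs file)")
```

Using the median over several repeats reduces the influence of one-off outliers such as a cold page cache on the first read.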

pytorch-bot · 2026-06-01T16:40:49Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/meta-pytorch/torchcodec/1474

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

NicolasHug added 2 commits June 1, 2026 17:30

Add benchmark for WavDecoder

61659f7

cleanup

5cc8f3d

meta-cla Bot added the CLA Signed label Jun 1, 2026

NicolasHug merged commit 19f1202 into meta-pytorch:main Jun 1, 2026
26 checks passed

NicolasHug deleted the wav_benchmark branch June 1, 2026 16:42
