Add WAV file I/O support to all audio-processing tools #183

Open
ludomal wants to merge 5 commits into openitu:dev from ludomal:feature/wav-io

ludomal commented May 30, 2026

Summary

Introduce a shared wav_io library that adds transparent WAV file support to the STL's audio-processing tools (30 integrated; g728 and g728fp are deferred). Tools can now accept .wav input and produce .wav output while remaining fully backward compatible with raw PCM workflows.

This PR:

Motivation

The STL tools currently only handle raw 16-bit PCM files, requiring users to manually convert WAV files before processing and convert back afterward. This is error-prone (wrong sample rate, byte order, bit depth) and creates friction for users working with standard audio formats.

Design

A single shared library (src/utl/wav_io.c, wav_io.h) provides:

  • Auto-detect WAV input via RIFF header magic bytes — no command-line flag needed
  • Output format by extension: .wav produces WAV, anything else produces raw PCM
  • Sample rate propagation — WAV output inherits the input file's sample rate
  • Parameter validation — when a tool has a fixed or user-specified sample rate, WAV headers are validated against it
  • Optional -sf/-fs with WAV — if the user omits the sample rate option, the WAV header's rate is used directly; if specified, it's validated against the header
  • Bit depth: the wav_io library supports 8/16/24/32-bit PCM and 32-bit float at the I/O layer. However, since all STL tools operate on 16-bit short buffers, they validate expected_bits = 16 on open — non-16-bit WAV files are rejected with a clear error.
  • Multi-channel handling: extracts channel 1 with a warning (tools expect mono)
  • Correct seeking: audio_seek() accounts for WAV header offset
  • Correct file size: audio_get_data_size() returns PCM data size excluding header
  • Full backward compatibility: existing raw PCM workflows are completely unchanged
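
The first two bullets (RIFF magic-byte detection on input, extension-based selection on output) can be pictured with the sketch below. It is not the code in wav_io.c: the helper names `looks_like_wav` and `has_wav_extension` are hypothetical, and the case-sensitive extension match is an assumption made for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the leading bytes of a file look like a
 * RIFF/WAVE container ("RIFF" at offset 0, "WAVE" at offset 8), else 0. */
int looks_like_wav(const unsigned char *head, size_t len)
{
    assert(head != NULL);
    if (len < 12)
        return 0;   /* too short to hold the 12-byte RIFF preamble */
    return memcmp(head, "RIFF", 4) == 0 && memcmp(head + 8, "WAVE", 4) == 0;
}

/* Hypothetical helper: returns 1 if filename ends in ".wav", else 0.
 * Assumption: the match is case-sensitive. */
int has_wav_extension(const char *filename)
{
    size_t n;

    assert(filename != NULL);
    n = strlen(filename);
    return n >= 4 && strcmp(filename + n - 4, ".wav") == 0;
}
```

A raw PCM file whose first samples happen to spell "RIFF....WAVE" would be misdetected by any magic-byte check of this kind; in practice that is very unlikely for real audio.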

API

/* Open for reading — auto-detects WAV vs raw. Validates expected params if > 0. */
AUDIO_FILE *audio_open_read(const char *filename, long expected_rate, int expected_channels, int expected_bits);

/* Open for writing — produces WAV if filename ends in .wav, raw otherwise. */
AUDIO_FILE *audio_open_write(const char *filename, long sample_rate, int channels, int bits_per_sample);

/* Read/write samples */
long audio_read(AUDIO_FILE *af, void *buffer, long nsamples);
long audio_write(AUDIO_FILE *af, void *buffer, long nsamples);

/* Seek relative to start of PCM data (accounts for WAV header) */
int audio_seek(AUDIO_FILE *af, long offset);

/* Close file (updates WAV header with final data size) */
void audio_close(AUDIO_FILE *af);

/* Accessors */
int audio_is_wav(AUDIO_FILE *af);
long audio_get_sample_rate(AUDIO_FILE *af);
int audio_get_channels(AUDIO_FILE *af);
long audio_get_data_size(AUDIO_FILE *af);
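
The comment on audio_close() says the WAV header is updated with the final data size. As a rough illustration of what that step involves, the sketch below patches the two size fields of a header once the number of PCM bytes is known. It assumes the canonical 44-byte PCM header layout (RIFF size at offset 4, data size at offset 40); `finalize_wav_header` and `put_le32` are hypothetical names, not functions from this PR.

```c
#include <assert.h>
#include <stdio.h>

/* Write a 32-bit value in little-endian byte order, as RIFF requires. */
static int put_le32(FILE *fp, unsigned long v)
{
    unsigned char b[4];

    b[0] = (unsigned char)(v & 0xFF);
    b[1] = (unsigned char)((v >> 8) & 0xFF);
    b[2] = (unsigned char)((v >> 16) & 0xFF);
    b[3] = (unsigned char)((v >> 24) & 0xFF);
    return fwrite(b, 1, 4, fp) == 4 ? 0 : -1;
}

/* Hypothetical helper: patch the size fields of a canonical 44-byte PCM WAV
 * header after all samples have been written.
 * Returns 0 on success, -1 on an I/O error. */
int finalize_wav_header(FILE *fp, unsigned long data_bytes)
{
    assert(fp != NULL);
    /* RIFF chunk size (offset 4) = total file size minus the 8-byte preamble */
    if (fseek(fp, 4L, SEEK_SET) != 0 || put_le32(fp, 36UL + data_bytes) != 0)
        return -1;
    /* data chunk size (offset 40) = number of PCM bytes */
    if (fseek(fp, 40L, SEEK_SET) != 0 || put_le32(fp, data_bytes) != 0)
        return -1;
    return fflush(fp) == 0 ? 0 : -1;
}
```

Deferring the size fields to close time lets a tool stream output without knowing the length in advance, at the cost of needing a seekable output file.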

Parameter handling with WAV files

| Scenario | Behavior |
| --- | --- |
| WAV input, no -sf/-fs option | Accepts any rate; uses WAV header value for processing |
| WAV input, explicit -sf/-fs matching | Works normally |
| WAV input, explicit -sf/-fs mismatching | Error: rate mismatch |
| Codec tools (g711, g722, g726, etc.) | Always validate against codec's intrinsic rate |
| bs1770demo | Always validates 48000 Hz |
| Raw PCM input | No validation possible; uses CLI parameter or tool default |
| WAV output (.wav extension) | Header written with correct sample rate from input/codec |
| Non-.wav output | Raw PCM as before |
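
The input-side rows of this table reduce to a small decision function. The sketch below is only an illustration of that logic, not code from the PR: `resolve_sample_rate` is a hypothetical name, and using 0 to mean "raw input" or "option omitted" is an assumption of the sketch. For codec tools and bs1770demo, the fixed rate would be passed as `cli_rate`.

```c
#include <assert.h>

/* Hypothetical helper mirroring the input-side scenarios.
 *   header_rate:  rate read from the WAV header, or 0 for raw PCM input
 *   cli_rate:     rate from -sf/-fs (or a codec's fixed rate), 0 if omitted
 *   default_rate: the tool's built-in default
 * Returns the rate to process at, or -1 on a WAV/CLI rate mismatch. */
long resolve_sample_rate(long header_rate, long cli_rate, long default_rate)
{
    assert(header_rate >= 0 && cli_rate >= 0 && default_rate > 0);
    if (header_rate > 0) {
        /* WAV input: an explicit or fixed rate must agree with the header */
        if (cli_rate > 0 && cli_rate != header_rate)
            return -1;
        return header_rate;
    }
    /* Raw PCM: nothing to validate against */
    return cli_rate > 0 ? cli_rate : default_rate;
}
```

For example, a 16 kHz WAV fed to a tool whose default is 8 kHz, with no rate option given, resolves to 16000; the same file with an explicit 8000 is rejected.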

Integration

Integrated into 30 tools across all categories:

| Category | Tools |
| --- | --- |
| Codecs | g711demo, g711iplc, g722demo, encg722, decg722, g726demo, vbr-g726, g727demo, rpedemo |
| Filters | firdemo, filter, cirsdemo, pcmdemo, c712demo |
| Measurement | sv56demo, actlevel, bs1770demo, mnrudemo, calc-snr, esdru, freqresp |
| Processing | reverb, stereoop |
| Utilities | scaldemo, signal-diff, astrip, fdelay, measure, oper, sine |

Not integrated

| Tool | Reason |
| --- | --- |
| g728, g728fp | Complex endianness swap logic in I/O loops; requires rework |

Changes

  • New files: src/utl/wav_io.c, src/utl/wav_io.h, src/utl/test_wav_io.c, src/utl/mkwav.c, src/utl/test_wav_io_validate.c, src/utl/test_wav_integration.cmake
  • Modified: 29 tool source files — replace fopen/fread/fwrite/fseek/stat/rewind with wav_io API calls
  • CMakeLists.txt: Link wav_io.c into each tool's build target; add integration test suite
  • Documentation: LaTeX manual section in doc/manual/utl.tex

Testing

Unit tests (test_wav_io)

  • Write/read roundtrip for 16-bit mono WAV
  • Raw PCM file detection and passthrough
  • Extension-based output format selection
  • Sample rate mismatch rejection
  • Stereo channel extraction
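
For the stereo extraction case, the behavior under test (keep channel 1, drop the rest) amounts to de-interleaving frames. A minimal sketch under the assumption of interleaved 16-bit samples follows; `extract_first_channel` is a hypothetical name and not part of the wav_io API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy the first channel out of interleaved 16-bit
 * frames. `in` holds nframes * channels samples; `out` receives nframes
 * samples. Returns the number of samples written. */
size_t extract_first_channel(const short *in, size_t nframes, int channels,
                             short *out)
{
    size_t i;

    assert(in != NULL && out != NULL && channels >= 1);
    for (i = 0; i < nframes; i++)
        out[i] = in[i * (size_t)channels];  /* sample 0 of each frame */
    return nframes;
}
```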

Integration tests (110 CTest cases in test_wav_integration.cmake)

Case 1 — WAV input with matching parameters (25 tools):
g711demo, g726demo, sv56demo, mnrudemo, firdemo, reverb, cirsdemo, c712demo, pcmdemo, filter, scaldemo, stereoop, astrip, fdelay, oper, g711iplc, vbr-g726, g722demo, encg722, bs1770demo, esdru, actlevel, freqresp, g727demo, signal-diff

Case 2 — WAV input with wrong sample rate → rejected (9 tests):
g711 (48k→8k), g726 (16k→8k), sv56 explicit mismatch (×2), vbr-g726 (16k→8k), g722 (8k→16k), bs1770 (8k→48k), g711iplc (16k→8k), g727 (16k→8k)

Case 3 — WAV input without -sf/-fs → accepted (5 tools):
sv56demo (16k WAV, default 16k), filter (16k WAV, default 8k), actlevel (8k WAV, default 16k), esdru (16k WAV, default 48k), freqresp (8k WAV, default 16k)

Case 4 — WAV output with correct header (21 tools):
g711demo, sv56demo, mnrudemo, reverb, decg722, g726demo, vbr-g726, g727demo, g711iplc, filter, firdemo, cirsdemo, pcmdemo, c712demo, scaldemo, stereoop, astrip, fdelay, oper, esdru, bs1770demo, sine
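
Checking that an output header carries the right rate comes down to reading one little-endian field. The sketch below shows the idea for a canonical 44-byte PCM header, where the fmt chunk directly follows the RIFF/WAVE preamble and the sample rate sits at byte offset 24. That layout is an assumption, `wav_header_sample_rate` is a hypothetical name, and the PR's test_wav_io_validate utility may work differently.

```c
#include <assert.h>
#include <stddef.h>

/* Decode a little-endian 32-bit unsigned value. */
static unsigned long get_le32(const unsigned char *p)
{
    return (unsigned long)p[0] | ((unsigned long)p[1] << 8) |
           ((unsigned long)p[2] << 16) | ((unsigned long)p[3] << 24);
}

/* Hypothetical check: return the sample rate stored in a canonical PCM WAV
 * header, or 0 if the buffer is too short to contain the field. */
unsigned long wav_header_sample_rate(const unsigned char *hdr, size_t len)
{
    assert(hdr != NULL);
    if (len < 28)
        return 0;               /* rate field spans bytes 24..27 */
    return get_le32(hdr + 24);
}
```

Files with extra chunks before fmt would need a proper chunk walk rather than a fixed offset.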

Existing tests

All pre-existing ctest tests continue to pass unchanged (raw PCM path).

Backward Compatibility

Fully backward compatible. When input files lack a RIFF header, tools behave exactly as before. When output filenames don't end in .wav, raw PCM is produced. No existing scripts or workflows are affected.

Ludovic Malfait added 4 commits May 26, 2026 16:14
Introduce wav_io library (src/utl/wav_io.c, wav_io.h) providing
transparent WAV and raw PCM file handling for all STL tools.

Features:
- Auto-detect WAV input via RIFF header magic bytes
- Output format determined by filename extension (.wav = WAV, else raw)
- Supports 8-bit, 16-bit, 24-bit, 32-bit PCM and 32-bit IEEE float
- Multi-channel WAV: extracts channel 1 with warning
- Parameter validation: expected sample rate, channels, and bit depth
- Full backward compatibility: raw PCM workflows unchanged

Integrated into 28 tools across all categories:
- Codecs: g711demo, g711iplc, g722demo, encg722, decg722, g726demo,
  vbr-g726, g727demo, rpedemo
- Filters: firdemo, filter, cirsdemo, pcmdemo, c712demo
- Measurement: sv56demo, actlevel, bs1770demo, mnrudemo, calc-snr,
  esdru, freqresp
- Processing: reverb, stereoop
- Utilities: scaldemo, signal-diff, astrip, fdelay, measure, oper

Includes unit tests (test_wav_io) and LaTeX manual documentation.

- Propagate sample rate from input to output in all 18 tools that
  previously passed 0 to audio_open_write
- Validate WAV sample rate against codec's intrinsic rate (g711=8k,
  g722=16k, g726/g727=8k, bs1770=48k) or user-specified -sf/-fs
- Make -sf/-fs optional with WAV input: if omitted, use WAV header
  rate directly (sv56demo, actlevel, filter, esdru, freqresp)
- Add audio_seek() to fix fseek calls that ignored WAV data_offset
- Add audio_get_data_size() to fix stat()-based size calculations
  that included WAV header bytes
- Add 108 integration tests covering WAV input (matching/mismatching
  parameters), WAV output header validation, and optional -sf behavior
- Add mkwav and test_wav_io_validate test utilities

Allows sine to output .wav files with correct sample rate header.
Includes WAV output test verifying 16000 Hz header.
Note: g728, g728fp, spdemo not integrated due to complexity
(endianness handling, G.192 format conversion).

g726demo operates on log-domain (A-law/mu-law) samples, not linear PCM.
The tests incorrectly fed linear PCM to g726demo causing segfaults.
vbr-g726 WAV tests already cover G.726 WAV integration correctly.
Development

Successfully merging this pull request may close these issues.

WAV as input format and output format