Quantized (INT8) release artifacts for the WASM df_create() path #686

@bandidas1

What

Would it be possible to publish a quantized release artifact (INT8 dynamic, or Q4_0/Q8_0-style) of the DeepFilterNet3 model that drops directly into the existing WASM df_create(modelBytes, atten_lim) path?

Currently the released deepfilter3_model.tar.gz is FP32, ~7.7 MB. On 3G connections the cold load (8.6 MB df_bg.wasm + 7.7 MB model, ~16.3 MB total) takes ~95 s before processing starts. A quantized model (~2-4 MB) would bring the total down to ~10.6-12.6 MB and, if the wait is download-bound, cut roughly a quarter to a third of it, likely at the cost of a small DNSMOS drop.
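
As a rough back-of-envelope check on those numbers, here is a small TypeScript sketch. It assumes the ~95 s wait is purely transfer-bound at constant throughput (our assumption; WASM compilation and model initialization are ignored), and the function names are ours, for illustration only.

```typescript
// Sizes and timing taken from the figures quoted above.
const WASM_MB = 8.6;             // df_bg.wasm
const FP32_MODEL_MB = 7.7;       // deepfilter3_model.tar.gz
const OBSERVED_COLD_LOAD_S = 95; // observed on 3G

// Assumption: the wait is dominated by transfer time, so the effective
// throughput can be derived from the observed cold load.
const throughputMBps = (WASM_MB + FP32_MODEL_MB) / OBSERVED_COLD_LOAD_S;

// Estimated cold-load time if only the model size changes.
function estimateColdLoadSeconds(modelMB: number): number {
  return (WASM_MB + modelMB) / throughputMBps;
}

// Fraction of the observed wait that a smaller model would remove.
function savedFraction(modelMB: number): number {
  return 1 - estimateColdLoadSeconds(modelMB) / OBSERVED_COLD_LOAD_S;
}

for (const modelMB of [2, 4]) {
  const seconds = estimateColdLoadSeconds(modelMB).toFixed(1);
  const saved = (savedFraction(modelMB) * 100).toFixed(0);
  console.log(`${modelMB} MB model: ~${seconds} s cold load, ~${saved}% saved`);
}
```

Under that assumption a 2 MB model lands at about 62 s and a 4 MB model at about 73 s, so the df_bg.wasm download itself becomes the dominant remaining cost.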

Why this matters

We're a free, browser-only audio toolset (https://timbrica.com/en/denoise) that adopted DFN3 via the WASM path you ship. Telemetry across ~1700 unique sessions in the 3 days after launch shows that the largest perceived cost is the initial download, not the inference itself, which is already snappy. A quantized variant for the existing WASM runtime would preserve everything (single asset, df_create() API, no DSP rewrite) while dramatically improving the experience for cold-start users.

What we considered

  1. Self-quantize and repackage the existing tar.gz — the format includes runtime constants beyond just weights and we couldn't confirm the WASM runtime accepts custom-quantized payloads.
  2. Migrate to onnxruntime-web with our own quantized ONNX — works in principle but requires reimplementing the full STFT(960, Vorbis) + ERB filter bank + iSTFT + mask combination logic in JS (the DSP currently lives inside df_bg.wasm). That's ~3-5 days of focused work + risk of quality regression vs. your reference.
  3. Use the existing FP32 — what we ship today; the cold-start cost is the open issue.
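
To give a feel for the DSP surface in option 2, here is a minimal sketch of just the analysis window. It reads "STFT(960, Vorbis)" as a 960-sample frame with the standard Vorbis power-complementary window; the 50% overlap (hop of 480) is our assumption, and the ERB filter bank, iSTFT and mask combination are not covered.

```typescript
// Standard Vorbis window: w[n] = sin(pi/2 * sin^2(pi * (n + 0.5) / N)).
function vorbisWindow(size: number): Float32Array {
  const w = new Float32Array(size);
  for (let n = 0; n < size; n++) {
    const s = Math.sin((Math.PI * (n + 0.5)) / size);
    w[n] = Math.sin((Math.PI / 2) * s * s);
  }
  return w;
}

const FFT_SIZE = 960;     // frame length named in option 2
const HOP = FFT_SIZE / 2; // assumption: 50% overlap between frames

const win = vorbisWindow(FFT_SIZE);

// With the window applied at both analysis and synthesis, overlap-add is
// exact when w[n]^2 + w[n + HOP]^2 = 1 (the Princen-Bradley condition).
let maxErr = 0;
for (let n = 0; n < HOP; n++) {
  const sum = win[n] * win[n] + win[n + HOP] * win[n + HOP];
  maxErr = Math.max(maxErr, Math.abs(sum - 1));
}
console.log(`max deviation from perfect reconstruction: ${maxErr}`);
```

The window is the easy part; matching the reference bit-for-bit across the remaining stages is where the regression risk sits.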

What would be ideal

A deepfilter3_model_q8.tar.gz (or similar) loadable by the same df_create() entry point. Even a 4-5 MB variant would help substantially.
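
On our side the integration could then be as small as the sketch below: try the quantized artifact first and fall back to FP32. The `/models/` paths, the helper name and the `atten_lim` value are hypothetical, the q8 file does not exist yet, and `fetchFn` is injected only so the helper can be exercised without a network.

```typescript
// Minimal subset of the fetch Response that the helper relies on.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  arrayBuffer(): Promise<ArrayBuffer>;
}>;

// Returns the bytes of the first candidate URL that can be fetched.
async function fetchFirstAvailable(
  urls: string[],
  fetchFn: FetchLike,
): Promise<{ url: string; bytes: Uint8Array }> {
  for (const url of urls) {
    try {
      const res = await fetchFn(url);
      if (res.ok) {
        return { url, bytes: new Uint8Array(await res.arrayBuffer()) };
      }
    } catch {
      // Network error: fall through to the next candidate.
    }
  }
  throw new Error("no model artifact could be fetched");
}

// Hypothetical usage in the browser (paths and atten_lim are placeholders):
//
//   const { bytes } = await fetchFirstAvailable(
//     ["/models/deepfilter3_model_q8.tar.gz", "/models/deepfilter3_model.tar.gz"],
//     (url) => fetch(url),
//   );
//   const state = df_create(bytes, 100); // same entry point as today
```

Nothing else in the pipeline would need to change, which is the main appeal of a drop-in artifact.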

If quantizing while keeping WASM runtime compatibility is impractical, would you accept a PR that exports a quantized ONNX bundle suitable for onnxruntime-web (matching the inference logic of df_bg.wasm)?

Happy to share telemetry, contribute a PR, or help test. Thanks for DFN3 — it's the best browser-deployable speech denoiser by a margin.

— Farid (Timbrica)
