Quantized (INT8) release artifacts for the WASM df_create() path #686

## What

Would it be possible to publish a quantized release artifact (INT8 dynamic, or `Q4_0`/`Q8_0`-style) of the DeepFilterNet3 model that drops directly into the existing WASM `df_create(modelBytes, atten_lim)` path?

Currently the released `deepfilter3_model.tar.gz` is FP32, ~7.7 MB. On 3G connections the cold load (8.6 MB `df_bg.wasm` + 7.7 MB model, 16.3 MB in total) takes ~95 s before processing starts. A quantized model (~2-4 MB) would cut the model download by half or more and the total cold-load transfer by roughly a quarter to a third, likely at the cost of a small DNSMOS drop.
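As a back-of-envelope check on these figures, here is a minimal sketch that scales the observed ~95 s by bytes transferred. It assumes cold-load time is proportional to download size, which ignores connection latency and WASM compile time, so the results are rough estimates rather than measurements. The function name is ours.

```python
# Sizes and timing quoted in this issue.
WASM_MB = 8.6          # df_bg.wasm
FP32_MODEL_MB = 7.7    # deepfilter3_model.tar.gz
COLD_LOAD_S = 95.0     # observed cold load on 3G

# Assumption: load time is proportional to bytes transferred.
RATE_MB_PER_S = (WASM_MB + FP32_MODEL_MB) / COLD_LOAD_S


def cold_load_seconds(model_mb: float) -> float:
    """Estimated cold load for a model of the given size, same runtime."""
    return (WASM_MB + model_mb) / RATE_MB_PER_S


for model_mb in (4.0, 2.0):
    seconds = cold_load_seconds(model_mb)
    saving = 1.0 - seconds / COLD_LOAD_S
    print(f"{model_mb:.1f} MB model: ~{seconds:.0f} s ({saving:.0%} saved)")
```

Under that assumption a 4 MB model lands around 73 s and a 2 MB model around 62 s, because the 8.6 MB runtime still has to be fetched either way.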

## Why this matters

We're a free, browser-only audio toolset (https://timbrica.com/en/denoise) that adopted DFN3 via the WASM path you ship. Telemetry across ~1700 unique sessions in the 3 days after launch shows that the largest perceived cost is the initial download, not the inference itself, which is already snappy. A quantized variant for the existing WASM runtime would preserve everything (single asset, `df_create()` API, no DSP rewrite) while substantially shortening the wait for cold-start users.

## What we considered

1. **Self-quantize and repackage** the existing tar.gz: the format includes runtime constants beyond just the weights, and we couldn't confirm that the WASM runtime accepts custom-quantized payloads.
2. **Migrate to onnxruntime-web** with our own quantized ONNX: works in principle, but requires reimplementing the full STFT(960, Vorbis) + ERB filter bank + iSTFT + mask combination logic in JS (the DSP currently lives inside `df_bg.wasm`). That's ~3-5 days of focused work, plus the risk of a quality regression vs. your reference.
3. **Use the existing FP32**: what we ship today; the cold-start cost is the open issue.
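To illustrate one small piece of what option 2 would have to reproduce, here is a sketch of the standard Vorbis window for a 960-sample frame, with a check of its power-complementary property. The 480-sample hop (50% overlap) is our assumption and is not stated in this issue. How `df_bg.wasm` actually applies and normalizes the window would need to be verified against the reference output.

```python
import math

FFT_SIZE = 960   # frame length named in this issue
HOP_SIZE = 480   # assumption: 50% overlap; not stated in this issue


def vorbis_window(size: int) -> list[float]:
    """Vorbis power-complementary window: sin(pi/2 * sin^2(pi*(n+0.5)/size))."""
    return [
        math.sin(0.5 * math.pi * math.sin(math.pi * (n + 0.5) / size) ** 2)
        for n in range(size)
    ]


window = vorbis_window(FFT_SIZE)

# Princen-Bradley condition: with the window applied at both analysis and
# synthesis, squared values half a frame apart must sum to 1.
worst = max(
    abs(window[n] ** 2 + window[n + HOP_SIZE] ** 2 - 1.0)
    for n in range(HOP_SIZE)
)
print(f"max deviation from perfect reconstruction: {worst:.2e}")
```

The window itself is the easy part. The ERB filter bank and mask combination are where a port is most likely to drift from the reference, which is why we would rather stay on the existing runtime.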

## What would be ideal

A `deepfilter3_model_q8.tar.gz` (or similar) loadable by the same `df_create()` entry point. Even a 4-5 MB variant would help substantially.
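For a rough sense of what such a variant might weigh, here is a sketch of block-wise symmetric 8-bit quantization (one scale per 32 weights, in the style of `Q8_0`) together with a size estimate. It assumes the 7.7 MB archive is dominated by FP32 weights and ignores gzip, non-weight constants, and any tensors that would stay in FP32, so the real artifact would likely be larger. It says nothing about what the WASM runtime can load. All function names are illustrative.

```python
import random

BLOCK = 32  # weights per block, as in ggml-style Q8_0 / Q4_0


def quantize_q8_block(values):
    """Symmetric 8-bit quantization of one block: one scale plus int8 codes."""
    peak = max(abs(v) for v in values)
    scale = peak / 127.0 if peak > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return scale, codes


def dequantize_block(scale, codes):
    """Reconstruct approximate float weights from a scale and integer codes."""
    return [scale * c for c in codes]


def packed_mb(fp32_mb, bits, scale_bytes=2):
    """Size if each FP32 weight takes `bits` bits plus one scale per block."""
    bytes_per_block = BLOCK * bits / 8 + scale_bytes
    return fp32_mb * bytes_per_block / (BLOCK * 4)


# Synthetic weights purely for illustration, not taken from the model.
random.seed(0)
weights = [random.gauss(0.0, 0.05) for _ in range(BLOCK)]
scale, codes = quantize_q8_block(weights)
restored = dequantize_block(scale, codes)
max_err = max(abs(a - b) for a, b in zip(weights, restored))

print(f"max round-trip error: {max_err:.2e} (bound: {scale / 2:.2e})")
print(f"8-bit estimate: {packed_mb(7.7, 8):.2f} MB")
print(f"4-bit estimate: {packed_mb(7.7, 4):.2f} MB")
```

Under those assumptions the 8-bit estimate comes out near 2 MB and the 4-bit one near 1 MB, so a 4-5 MB target would leave room to keep sensitive layers in FP32.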

If quantizing while keeping WASM runtime compatibility is impractical, would you accept a PR that exports a quantized ONNX bundle suitable for onnxruntime-web (matching the inference logic of `df_bg.wasm`)?

Happy to share telemetry, contribute a PR, or help test. Thanks for DFN3; it's the best browser-deployable speech denoiser by a margin.

— Farid (Timbrica)
