A flexible frequency-band splitter for music source separation, organized around a single separator family that can express BS-style, mel-style, and custom layouts.
Instead of treating BS-style and mel-band layouts as separate Python entry points, this package treats them as different band-layout configurations of one core design centered on BandSplitRotator.
The codebase is implemented in PyTorch, fully typed (py.typed), and designed for modular reuse so options such as PoPE, custom filter banks, and optional SageAttention acceleration can live on one aligned constructor surface.
If loading an older BS-style checkpoint or configuration raises a size-mismatch error, check mask_estimator_depth in the configuration.
Some earlier configuration files effectively used mask_estimator_depth=1 even when stored as 2 because a later subtraction was applied. This package removes that subtraction, so the direct equivalent is:
- set mask_estimator_depth=1
Updating that value resolves the most common mismatch quickly.
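As a rough illustration, the adjustment can be applied to a loaded configuration before the model is built. The helper name and the dictionary layout below are hypothetical. The sketch also assumes the older code subtracted exactly one, which is what the stored 2 versus effective 1 case suggests. Only apply it to configurations written for the older behavior.

```python
from typing import Any


def migrate_mask_estimator_depth(config: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a legacy config with the old implicit subtraction made explicit.

    Hypothetical helper: it assumes the legacy code used (stored value - 1).
    """
    migrated = dict(config)  # shallow copy so the caller's config is untouched
    stored_depth = int(migrated["mask_estimator_depth"])
    effective_depth = stored_depth - 1  # assumption: the old subtraction was exactly 1
    if effective_depth < 1:
        raise ValueError(
            f"stored mask_estimator_depth={stored_depth} leaves no layers after migration"
        )
    migrated["mask_estimator_depth"] = effective_depth
    return migrated


legacy_config = {"mask_estimator_depth": 2, "num_bands": 60}
migrated_config = migrate_mask_estimator_depth(legacy_config)
print(migrated_config["mask_estimator_depth"])
```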
- Forward-looking architecture: A single model family makes it easier to adopt new ideas, such as PoPE, HyperACE, or custom band-split definitions, while keeping interfaces aligned with established ecosystems.
- Universal configuration surface: BandSplitRotator exposes downstream attention, transformer, mask-estimator, optional HyperACE segmentation-branch, STFT, and loss settings on one constructor.
- Rich tooling and ecosystem: The package provides strong typing (py.typed), modular APIs, and rich docstrings focusing on usage, literature citations, and migration paths.
Transitioning from older codebases is straightforward because most downstream identifiers stay the same and the data flow is highly similar.
If you are migrating from an existing codebase, replace the older import path with BandSplitRotator, then choose the band layout in one of three ways: freqs_per_bands, sample_rate together with num_bands, or mask_filter_bank.
The key design idea is that the difference between the BS-style front end and the mel-band front end is treated as a band-layout problem, not as a reason to maintain two unrelated model families.
- hunterFormsBS.bandSplitRotator.BandSplitRotator is the primary unified entry point and the only public separator class exported by this package.
- hunterFormsBS.bands.BandSplit, hunterFormsBS.mask.MaskEstimator, hunterFormsBS.mask.MLP, hunterFormsBS.loss.lossComputation, hunterFormsBS.attend.Attention, and hunterFormsBS.transform.Transformer hold the reusable typed building blocks shared across all band-layout modes.
- hunterFormsBS.hyperACE.SegmModel supplies the optional segmentation-style mask-estimation branch used when use_hyperACE=True.
- Attention and transformer options such as attn_dropout, ff_dropout, flash_attn, sage_attention, and scale keep the same identifiers as they move from model constructors into downstream blocks.
- All user-configurable separator settings live on BandSplitRotator, including optional HyperACE branch settings under the segm_* prefix.
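At the band level, the model only needs a band-membership map, called mask_filter_bank in the codebase. You can think of that map as a Boolean matrix

$$M \in \{0, 1\}^{B \times F}$$

where $B$ is the number of bands, $F$ is the number of frequency bins, and $M_{b,f} = 1$ when frequency bin $f$ belongs to band $b$.

- In a non-overlapping BS-style layout, each frequency bin belongs to exactly one band, so $\sum_{b} M_{b,f} = 1$ for every $f$.
- In an overlapping mel-style layout, some frequency bins belong to more than one band, so $\sum_{b} M_{b,f} > 1$ for those bins.

When bands overlap, the reconstructed mask for a frequency bin is averaged across the contributing bands:

$$\hat{m}_{f} = \frac{\sum_{b} M_{b,f} \, m_{b,f}}{\sum_{b} M_{b,f}}$$

where $m_{b,f}$ is the mask value that band $b$ estimates for frequency bin $f$.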
That is why this package makes it easy to move between overlapping and non-overlapping bands, and to change how bands are distributed across the frequency axis. The architectural difference lives in the filter bank, not in two separate theories of the model.
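The averaging rule can be sketched in plain Python. This is a simplified illustration, not the package's implementation: the function name is hypothetical, masks are real-valued here rather than complex, and each band is stored at full width instead of as a gathered slice.

```python
def average_overlapping_masks(
    membership: list[list[bool]],
    band_masks: list[list[float]],
) -> list[float]:
    """Average per-band mask values over the bands that contain each frequency bin.

    membership[b][f] is True when bin f belongs to band b.
    band_masks[b][f] is the value band b estimates for bin f (ignored when not a member).
    """
    num_bands = len(membership)
    num_freqs = len(membership[0])
    averaged: list[float] = []
    for f in range(num_freqs):
        contributors = [band_masks[b][f] for b in range(num_bands) if membership[b][f]]
        if not contributors:
            raise ValueError(f"frequency bin {f} is not covered by any band")
        averaged.append(sum(contributors) / len(contributors))
    return averaged


# Three overlapping bands over four bins: bins 1 and 2 each belong to two bands.
membership = [
    [True, True, False, False],
    [False, True, True, False],
    [False, False, True, True],
]
band_masks = [
    [0.25, 0.5, 0.0, 0.0],
    [0.0, 1.0, 0.5, 0.0],
    [0.0, 0.0, 1.0, 0.75],
]
averaged = average_overlapping_masks(membership, band_masks)
print(averaged)  # [0.25, 0.75, 0.75, 0.75]
```

With a non-overlapping map every bin has exactly one contributor, so the same function returns the band values unchanged.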
| Use this configuration | When | Key parameters |
|---|---|---|
| BandSplitRotator with the default non-overlapping layout | You want a BS-style frequency partition or a close comparison with non-overlapping upstream layouts. | freqs_per_bands, optional num_bands |
| BandSplitRotator with automatic mel-band construction | You want a mel-band layout computed at runtime. | sample_rate, num_bands, melscale_fbanks_mel_scale, melscale_fbanks_norm |
| BandSplitRotator with explicit custom bands | You want a checkpoint-specific or research-specific layout. | mask_filter_bank |
flash_attn=True requests PyTorch scaled-dot-product-attention backends when the active device
supports that path. sage_attention=True asks downstream Attend blocks to call
sageattention.sageattn; install SageAttention manually
before enabling it because hunterFormsBS does not install that package.
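Because SageAttention is a manual install, one cautious pattern is to check whether the sageattention module can be imported before requesting it. The helper name and the options dictionary below are hypothetical. Only the sage_attention identifier and the sageattention module name come from the text above.

```python
import importlib.util


def sage_attention_available() -> bool:
    """Return True when the optional sageattention module can be found on this interpreter."""
    return importlib.util.find_spec("sageattention") is not None


# Hypothetical options dictionary: request SageAttention only when it is installed.
attention_options = {"sage_attention": sage_attention_available()}
print(attention_options)
```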
BandSplitRotator can choose RoPE or PoPE with use_pope, so the same model family can switch
positional encoders without changing entry points.
Most users never need this section. When mask_filter_bank is None and both sample_rate and
num_bands are provided, BandSplitRotator builds mel-band layouts at runtime with
torchaudio.functional.melscale_fbanks. The constructor parameters
melscale_fbanks_mel_scale and melscale_fbanks_norm are forwarded to torchaudio, so you can
choose the mel-scale formula and normalization rule used for automatic mel-band construction.
For the package default mel-band layout, keep sample_rate=44100, stft_n_fft=2048,
num_bands=60, melscale_fbanks_mel_scale='slaney', and melscale_fbanks_norm='slaney'.
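The default mel-band settings can be collected as keyword arguments. This is only a sketch: the dictionary name is hypothetical, and the constructor call stays commented out because the other constructor arguments are not described in this section. The bin count assumes a one-sided STFT, which yields n_fft // 2 + 1 frequency bins per channel.

```python
# Values copied from the package default mel-band layout described above.
default_mel_layout = {
    "sample_rate": 44100,
    "stft_n_fft": 2048,
    "num_bands": 60,
    "melscale_fbanks_mel_scale": "slaney",
    "melscale_fbanks_norm": "slaney",
}

# Assumption: a one-sided STFT, so the frequency axis has n_fft // 2 + 1 bins.
num_freq_bins = default_mel_layout["stft_n_fft"] // 2 + 1
print(num_freq_bins)  # 1025

# Hypothetical usage, not executed here:
# from hunterFormsBS import BandSplitRotator
# model = BandSplitRotator(**default_mel_layout)
```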
When sample_rate and num_bands are not both provided, BandSplitRotator derives the
non-overlapping BS-style Boolean band-membership map from freqs_per_bands.
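To make the derivation concrete, here is a minimal sketch of how a list of band widths could be turned into a non-overlapping membership map. The function name is hypothetical and the package's own construction may differ. The sketch uses nested Python lists so it runs without PyTorch. Converting the result with torch.tensor(rows, dtype=torch.bool) would give a Boolean tensor of shape (band, freq).

```python
from typing import Sequence


def membership_from_freqs_per_bands(freqs_per_bands: Sequence[int]) -> list[list[bool]]:
    """Build a (band, freq) Boolean map in which consecutive bins form each band."""
    total_freqs = sum(freqs_per_bands)
    rows: list[list[bool]] = []
    start = 0
    for width in freqs_per_bands:
        if width <= 0:
            raise ValueError("every band must contain at least one frequency bin")
        # Bins in [start, start + width) belong to this band and to no other band.
        rows.append([start <= f < start + width for f in range(total_freqs)])
        start += width
    return rows


rows = membership_from_freqs_per_bands((2, 2, 3))
for row in rows:
    print("".join("1" if member else "0" for member in row))
```

Each column of the result contains exactly one True, which is the non-overlapping property.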
If a checkpoint uses a different band layout, pass mask_filter_bank explicitly as a Boolean
torch.Tensor with shape (band, freq).
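The selection rules in this section can be summarized as a small decision function. This is a reading of the prose above, not the package's code: the function name and the returned labels are hypothetical.

```python
from typing import Any, Optional


def choose_band_layout(
    mask_filter_bank: Optional[Any] = None,
    sample_rate: Optional[int] = None,
    num_bands: Optional[int] = None,
) -> str:
    """Return which band layout the documented rules would select."""
    if mask_filter_bank is not None:
        return "custom"  # an explicit band-membership map is used as given
    if sample_rate is not None and num_bands is not None:
        return "mel"  # mel bands are built at runtime
    return "bs"  # otherwise the map is derived from freqs_per_bands


print(choose_band_layout())
print(choose_band_layout(sample_rate=44100, num_bands=60))
print(choose_band_layout(mask_filter_bank=[[True, False], [False, True]]))
```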
All optional segmentation-branch settings also live on BandSplitRotator under the segm_*
prefix.
- hunterFormsBS.__init__
  - Direct export: BandSplitRotator
  - Purpose: small top-level namespace for the primary separator model.
- hunterFormsBS.bandSplitRotator
  - Main symbols: BandSplitRotator
  - Purpose: unified separator that can build BS-style, mel-style, or custom band layouts from one model family, with downstream attention, transformer, STFT, mask-estimator, optional segmentation-branch, and loss options on the constructor.
- hunterFormsBS.bands
  - Main symbols: BandSplit, DEFAULT_FREQS_PER_BANDS
  - Purpose: band projection layer and the BS-style default frequency-bin partition.
- hunterFormsBS.loss
  - Main symbols: lossComputation
  - Purpose: multi-resolution STFT training-loss helper.
- hunterFormsBS.mask
  - Main symbols: MaskEstimator, MLP
  - Purpose: mask-estimation heads and band-local affine blocks.
- hunterFormsBS.attend
  - Main symbols: Attend, Attention
  - Purpose: shared attention core with RoPE / PoPE, PyTorch SDPA, and optional SageAttention support.
- hunterFormsBS.transform
  - Main symbols: FeedForward, Transformer
  - Purpose: position-wise feedforward block and stacked attention-and-feedforward transformer.
- hunterFormsBS.hyperACE
  - Main symbols: SegmModel, HyperACE, Backbone, Decoder, ProgressiveUpsampleHead
  - Purpose: optional segmentation-style mask-estimation branch and its reusable building blocks.
- hunterFormsBS.theTypes
  - Main symbols: ParametersComputeLoss, FlashAttentionConfig, ParametersAttention, ParametersMaskEstimator, ParametersSTFT, ParametersTransformer
  - Purpose: typed configuration records used across the package.
The stable separator path is
raw audio → STFT → band gathering → BandSplit → hierarchical attention → MaskEstimator → mask
followed by overlap-aware mask averaging when needed, complex masking in the STFT domain, and inverse STFT reconstruction back to waveform audio.
When use_hyperACE=True, each MaskEstimator head also adds the optional segmentation-style branch
before the final mask accumulation step.
The top-level package namespace currently re-exports the primary model that new users most often need:
BandSplitRotator
Supporting modules stay under explicit submodule imports so the main namespace remains small and comparisons with upstream repos stay easy to read.
- BibTeX citation. TeX Source with precise formulas for AI agents.
- eprint: arXiv.1606.08415
- Implementations:
- Common name: GLU (Gated Linear Units)
- BibTeX citation. TeX Source with precise formulas for AI agents.
- Proceedings: proceedings.mlr.press
- BibTeX citation. TeX Source with precise formulas for AI agents.
- Proceedings: proceedings.neurips.cc
- Common name: RMSNorm
- BibTeX citation. TeX Source with precise formulas for AI agents.
- Proceedings: proceedings.neurips.cc
- Implementations:
- bzhangGo/rmsnorm
- hunterhogan/torch_einops_kit.scaleValues.RMSNorm
- Common name: RoPE
- BibTeX citation. TeX Source with precise formulas for AI agents.
- DOI: 10.1016/j.neucom.2023.127063
- Free pre-print: arXiv:2104.09864
- Original blog about the idea: https://kexue.fm/archives/8265
- An intuitive explanation: https://blog.eleuther.ai/rotary-embeddings/
- Implementations:
- BibTeX citation. TeX Source with precise formulas for AI agents.
- Proceedings: proceedings.neurips.cc
- BibTeX citation. TeX Source with precise formulas for AI agents.
- Proceedings: proceedings.neurips.cc
- Implementations:
- BibTeX citation. TeX Source with precise formulas for AI agents.
- Proceedings: proceedings.iclr.cc
- Implementation:
SageAttention2: Efficient Attention with Thorough Outlier Smoothing and Per-thread INT4 Quantization
- BibTeX citation. TeX Source with precise formulas for AI agents.
- Proceedings: proceedings.mlr.press
- Implementation:
- BibTeX citation. TeX Source with precise formulas for AI agents.
- Workshop proceedings: OpenReview
- Implementation:
- BibTeX citation.
- Proceedings: ICLR 2022 Conference
- Implementations:
- Common name: BS-RoFormer
- BibTeX citation. TeX Source with precise formulas for AI agents.
- DOI: 10.1109/ICASSP48485.2024.10446843
- Free pre-print: arXiv:2309.02612
- Implementations:
- Common name: MelBand-RoFormer
- BibTeX citation. TeX Source with precise formulas for AI agents.
- Proceedings: 10.5281/zenodo.14877371
- Implementations:
- Common name: TFC-TDF-UNet v3
- BibTeX citation. TeX Source with precise formulas for AI agents.
- eprint: arXiv.2306.09382
- Implementation:
Music Source Separation Based on a Lightweight Deep Learning Framework (DTTNET: DUAL-PATH TFC-TDF UNET)
- Common name: DTTNet
- BibTeX citation. TeX Source with precise formulas for AI agents.
- DOI: 10.1109/ICASSP48485.2024.10448020
- Free pre-print: arXiv:2309.08684
- Implementation:
- Common names: YOLOv13, HyperACE, FullPAD
- BibTeX citation. TeX Source with precise formulas for AI agents.
- eprint: arXiv.2506.17733
- Implementation:
- Common name: PoPE
- BibTeX citation. TeX Source with precise formulas for AI agents.
- OpenReview.net: ICLR 2026 Conference
- eprint: arXiv.2509.10534
- Implementations:
