1.1.0

Latest

Si13n7 released this 18 Mar 18:49

Performance

All binary-to-text encodings have been completely rewritten for maximum throughput. Every encoding now uses Parallel.For across all available CPU cores with double-buffered I/O — the write of one batch overlaps with the encode of the next. ArrayPool<byte> eliminates per-batch heap allocations throughout.
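
The overlap of writing and encoding described above can be sketched in a few lines. This is a Python illustration of the double-buffering idea only, not the library's C# implementation: the function name `encode_stream`, the `batch_size` parameter, and the single background writer thread are assumptions made for the example, and Base-16 via `binascii.hexlify` stands in for any encoder.

```python
import binascii
import io
from concurrent.futures import ThreadPoolExecutor


def encode_stream(src, dst, batch_size=1 << 16):
    """Hex-encode src into dst, writing batch N while batch N+1 is encoded.

    Hypothetical helper: the name and the default batch size are illustrative.
    """
    with ThreadPoolExecutor(max_workers=1) as writer:
        pending = None  # future of the write that is still in flight
        while True:
            chunk = src.read(batch_size)
            if not chunk:
                break
            # Encode the next batch while the previous write may still run.
            encoded = binascii.hexlify(chunk)
            if pending is not None:
                pending.result()  # previous write must finish first (ordering)
            pending = writer.submit(dst.write, encoded)
        if pending is not None:
            pending.result()  # flush the last outstanding write


src = io.BytesIO(b"hello world" * 3)
dst = io.BytesIO()
encode_stream(src, dst, batch_size=8)
print(dst.getvalue().decode())
```

Only one write is ever outstanding, so two buffers are alive at a time: the one being written and the one being encoded. A real implementation would additionally split the encode step of each batch across cores.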

Encoding   Before      After       Improvement
Base-2     30 MiB/s    2.2 GiB/s   +7100%
Base-8     40 MiB/s    1.0 GiB/s   +2355%
Base-10    44 MiB/s    1.2 GiB/s   +2485%
Base-16    45 MiB/s    7.5 GiB/s   +16200%
Base-32    379 MiB/s   1.3 GiB/s   +235%
Base-64    481 MiB/s   9.6 GiB/s   +1900%
Base-85    167 MiB/s   2.8 GiB/s   +1590%
Base-91    339 MiB/s   379 MiB/s   +12%

Benchmarked on AMD Ryzen 5 7600 (AVX2 + AVX-512), 32 GB DDR5, Manjaro Linux, .NET 10, Release build.

Hardware acceleration paths added per encoding:

  • Base-2: AVX2 (32 bytes/iter) and AVX-512 (64 bytes/iter). Full SIMD bit-expansion via a 3-stage UnpackLow/High interleave chain with Permute2x128 reassembly and ExtractVector256 for the AVX-512 path.
  • Base-8: AVX2 and AVX-512. Pure bit-shift extraction (b>>6, (b>>3)&7, b&7) with SIMD digit-to-ASCII mapping via Add.
  • Base-10: AVX2 and AVX-512. Magic-number multiply-shift (MultiplyHigh) replaces integer division for hundreds/tens/ones digit extraction across all bytes simultaneously.
  • Base-16: AVX2 and AVX-512. Nibble extraction + vpshufb LUT lookup + UnpackLow/High interleave + Permute2x128 reassembly. The most SIMD-friendly encoding after Base-64.
  • Base-32: AVX2 vpshufb alphabet lookup with Permute2x128 for lane-correct output. The non-power-of-two 5-bit group width prevents full SIMD vectorization of the bit-extraction phase.
  • Base-64: Delegates to .NET's built-in System.Buffers.Text.Base64 (internally AVX2-accelerated). Parallelization and double-buffered I/O layered on top.
  • Base-85: AVX2. Magic-number division (floor(x/85) via multiply-shift) for all 8 groups simultaneously, PackUnsignedSaturate + vpshufb transpose for 5-digit output, two-table lookup decode replacing 5 multiplies with 2 table lookups + 2 additions.
  • Base-91: No SIMD or parallelization possible due to the serial bit-accumulator dependency chain. Minor gains from chunk-based I/O and a reverse lookup table replacing Contains + IndexOf.
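
The magic-number technique named in the Base-10 and Base-85 entries replaces division by a constant with one multiply and one shift. The sketch below is scalar Python rather than the SIMD code from the library, and the constants (205 with shift 11, 41 with shift 12, 0xC0C0C0C1 with shift 38) are commonly used choices for these input ranges. They are assumptions here, since the release notes do not say which constants the encoders use.

```python
def div10(x):
    # floor(x / 10) for 0 <= x <= 255, using multiply-shift instead of division
    return (x * 205) >> 11


def div100(x):
    # floor(x / 100) for 0 <= x <= 255
    return (x * 41) >> 12


def byte_to_decimal_digits(b):
    """Split one byte into (hundreds, tens, ones) without dividing."""
    hundreds = div100(b)
    rest = b - hundreds * 100
    tens = div10(rest)
    ones = rest - tens * 10
    return hundreds, tens, ones


def div85(x):
    # floor(x / 85) for any unsigned 32-bit x; 0xC0C0C0C1 == ceil(2**38 / 85)
    return (x * 0xC0C0C0C1) >> 38


def base85_digits(group):
    """Five base-85 digits (most significant first) of a 32-bit group."""
    digits = []
    for _ in range(5):
        q = div85(group)
        digits.append(group - q * 85)  # remainder without a modulo operation
        group = q
    return digits[::-1]


print(byte_to_decimal_digits(255))  # (2, 5, 5)
print(base85_digits(0x4D616E20))
```

In a SIMD version the same multiply and shift are applied to every lane at once, which is why the division-free form matters: vector instruction sets generally have no integer divide.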

Added

  • Full support for .NET 10.0
  • BinaryToTextEncoding.ReadBuffer: protected helper method for filling a buffer from a stream, available to all encoding subclasses.

Fixed

  • Base32: Fixed incorrect encoded output for certain inputs. The previous bit-manipulation loop produced wrong values at group boundaries. It has been replaced with an explicit stateless 5-bit extraction (EncodeGroup) that is mathematically correct per RFC 4648.
  • Base32: Partial block encoding now correctly follows RFC 4648 — ceil(rem*8/5) chars followed by = padding to complete the 8-char group.
  • Base32: Decode now validates partial block lengths — lengths 1, 3, and 6 are invalid per RFC 4648 and throw DecoderFallbackException.
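
The partial-block rules in the Base32 entries above follow from simple arithmetic, which the following Python sketch works through. It uses Python's standard `base64.b32encode` (an RFC 4648 implementation) as a reference rather than the library's own encoder, and `partial_block_chars` is a hypothetical helper name introduced for the example.

```python
import base64


def partial_block_chars(rem):
    """Data characters needed for a final block of rem bytes (1..4).

    Integer form of ceil(rem * 8 / 5): each character carries 5 bits.
    """
    return (rem * 8 + 4) // 5


# Character counts that a valid partial block can have before padding.
valid = {partial_block_chars(rem) for rem in range(1, 5)}
# Every other count below a full 8-character group cannot occur.
invalid = set(range(1, 8)) - valid

print(sorted(valid))    # [2, 4, 5, 7]
print(sorted(invalid))  # [1, 3, 6]

for rem in range(1, 5):
    encoded = base64.b32encode(b"x" * rem).decode()
    data_chars = len(encoded.rstrip("="))
    # The group is always padded with '=' to 8 characters in total.
    print(rem, data_chars, len(encoded), encoded)
```

A decoder that strips the padding and finds 1, 3, or 6 data characters in the last group therefore knows the input is malformed, because no whole number of bytes produces those lengths.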

Changed

  • Helper.GetBufferSize now wraps stream.Length in a try/catch to handle streams that do not support seeking (e.g. NetworkStream, CryptoStream) without throwing NotSupportedException.
  • Helper.GetBufferSize buffer thresholds increased to better match the large chunk sizes used by the optimized encoders.
  • BinaryToTextExtensions.GetDefaultInstance simplified — redundant while loops replaced with a single Interlocked.CompareExchange per level, and Enum.GetValues updated to the generic overload to avoid boxing.
  • TextConvert.Rot13(string) now uses string.Create to write directly into the result string, eliminating the intermediate ToArray allocation.
  • TextConvert.FormatSeparators(string) now uses MemoryStream.GetBuffer instead of ToArray to avoid an unnecessary copy on return.
  • WriteLine overloads removed from BinaryToTextEncoding base class. The remaining overload used exclusively by Base91 has been moved directly into that class.

Removed

  • Support for older .NET versions that have reached or will soon reach EOL

Documentation

  • Performance <remarks> added to all encoding classes, covering the SIMD paths used, throughput relative to other encodings, and, for Base-91, an explanation of why parallelization is impossible without breaking compatibility.