|
2 | 2 | //! |
3 | 3 | //! # BINSEQ |
4 | 4 | //! |
5 | | -//! The `binseq` library provides efficient APIs for working with the [BINSEQ](https://www.biorxiv.org/content/10.1101/2025.04.08.647863v1) file format family. |
| 5 | +//! The `binseq` library provides efficient APIs for working with the [BINSEQ](https://www.biorxiv.org/content/10.1101/2025.04.08.647863v2) file format family. |
6 | 6 | //! |
7 | 7 | //! It offers methods to read and write BINSEQ files, providing: |
8 | 8 | //! |
9 | 9 | //! - Compact multi-bit encoding and decoding of nucleotide sequences through [`bitnuc`](https://docs.rs/bitnuc/latest/bitnuc/) |
10 | | -//! - Memory-mapped file access for efficient reading ([`bq::MmapReader`] and [`vbq::MmapReader`]) |
11 | | -//! - Parallel processing capabilities for arbitrary tasks through the [`ParallelProcessor`] trait. |
12 | | -//! - Configurable [`Policy`] for handling invalid nucleotides |
13 | 10 | //! - Support for both single and paired-end sequences |
14 | | -//! - Optional sequence headers/identifiers (VBQ format) |
15 | | -//! - Abstract [`BinseqRecord`] trait for representing records from both `.bq` and `.vbq` files. |
16 | | -//! - Abstract [`BinseqReader`] enum for processing records from both `.bq` and `.vbq` files. |
17 | | -//! - Abstract [`BinseqWriter`] enum for writing records to both `.bq`, `.vbq`, and `.cbq` files. |
| 11 | +//! - Abstract [`BinseqRecord`] trait for representing records from all variants |
| 12 | +//! - Abstract [`BinseqReader`] enum for processing records from all variants |
| 13 | +//! - Abstract [`BinseqWriter`] enum for writing records to all variants |
| 14 | +//! - Parallel processing capabilities for arbitrary tasks through the [`ParallelProcessor`] trait. |
| 15 | +//! - Configurable [`Policy`] for handling invalid nucleotides (BQ/VBQ only; CBQ natively supports `N` nucleotides) |
| 16 | +//! |
| 17 | +//! ## Recent additions (v0.9.0) |
| 18 | +//! |
| 19 | +//! ### New variant: CBQ |
| 20 | +//! **[`cbq`]** is a new variant of BINSEQ that solves many of the pain points around VBQ. |
| 21 | +//! The CBQ format is a columnar-block-based format that offers improved compression and faster processing speeds compared to VBQ. |
| 22 | +//! It natively supports `N` nucleotides and avoids the need for additional 4-bit encoding. |
| 23 | +//! |
| 24 | +//! ### Improved interface for writing records |
| 25 | +//! **[`BinseqWriter`]** provides a unified interface for writing records generically to BINSEQ files. |
| 26 | +//! It builds on the new [`SequencingRecord`], which provides a cleaner builder API for constructing those records. |
18 | 27 | //! |
19 | 28 | //! ## Recent VBQ Format Changes (v0.7.0+) |
20 | 29 | //! |
|
28 | 37 | //! |
29 | 38 | //! Legacy VBQ files are automatically migrated to the new format when accessed. |
30 | 39 | //! |
31 | | -//! ## Crate Organization |
32 | | -//! |
33 | | -//! This library is split into 3 major parts. |
34 | | -//! |
35 | | -//! There are the [`bq`] and [`vbq`] modules, which provide tools for reading and writing `BQ` and `VBQ` files respectively. |
36 | | -//! Then there are traits and utilities that are ubiquitous across the library which are available at the top-level of the crate. |
37 | | -//! |
38 | 40 | //! # Example: Memory-mapped Access |
39 | 41 | //! |
40 | 42 | //! ``` |
|
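The doc comment notes that BQ/VBQ need a `Policy` for invalid nucleotides while CBQ supports `N` natively. The sketch below illustrates why a plain 2-bit packing has no room for `N`. It is a standalone illustration, not the `bitnuc` or `binseq` implementation: the function names `encode_base`, `pack_2bit`, and `unpack_2bit` and the bit assignments are assumptions made for this example.

```rust
/// Map a nucleotide to a 2-bit code (illustrative assignment, not taken from `bitnuc`).
/// Two bits give exactly four codes, so anything outside A/C/G/T (e.g. `N`) has none.
fn encode_base(base: u8) -> Option<u8> {
    match base.to_ascii_uppercase() {
        b'A' => Some(0b00),
        b'C' => Some(0b01),
        b'G' => Some(0b10),
        b'T' => Some(0b11),
        _ => None,
    }
}

/// Pack up to 32 bases into one `u64`, first base in the lowest two bits.
/// Returns `None` if the sequence is too long or contains a base with no 2-bit code.
fn pack_2bit(seq: &[u8]) -> Option<u64> {
    if seq.len() > 32 {
        return None;
    }
    let mut packed = 0u64;
    for (i, &base) in seq.iter().enumerate() {
        packed |= u64::from(encode_base(base)?) << (2 * i);
    }
    Some(packed)
}

/// Recover `len` bases (capped at 32) from a packed word.
fn unpack_2bit(packed: u64, len: usize) -> Vec<u8> {
    const BASES: [u8; 4] = *b"ACGT";
    (0..len.min(32))
        .map(|i| BASES[((packed >> (2 * i)) & 0b11) as usize])
        .collect()
}
```

Calling `pack_2bit(b"ACGT")` yields a packed word that `unpack_2bit` turns back into `ACGT`, while `pack_2bit(b"ACNT")` returns `None`. Rejecting the input is only the simplest possible reaction to an `N`; a real writer has to choose some way of handling such bases, which is the role the documentation assigns to `Policy` for BQ/VBQ.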