Replace deprecated pytabix with pysam.TabixFile #15

@FedericaBrando

Problem:
OncodriveFML relies on pytabix, a deprecated and unmaintained package. pytabix is known to have significant issues, including incorrect indexing and position retrieval when querying genomic regions. These problems can lead to unreliable data handling and difficult-to-debug errors.

Proposed Solution:
I propose we replace all usage of pytabix with the pysam.TabixFile class. pysam is a well-maintained, robust library that provides a more reliable and efficient interface for handling tabix-indexed files.
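
As a rough sketch of what a migrated call site could look like (the helper names `fetch_rows` and `query_file` are hypothetical and do not exist in the current code base; the real call sites may need a different shape):

```python
from typing import Iterator, List


def fetch_rows(tbx, chrom: str, start: int, end: int) -> Iterator[List[str]]:
    """Yield each record overlapping the region as a list of column strings.

    `tbx` is assumed to be an open pysam.TabixFile. When start and end are
    passed as integers, pysam interprets them as 0-based, half-open.
    """
    for line in tbx.fetch(chrom, start, end):
        # Without a parser, pysam yields each record as a tab-separated string.
        yield line.rstrip("\n").split("\t")


def query_file(path: str, chrom: str, start: int, end: int) -> List[List[str]]:
    """Open a tabix-indexed file, run one region query, and close it again."""
    import pysam  # imported here so fetch_rows itself has no hard dependency

    tbx = pysam.TabixFile(path)
    try:
        return list(fetch_rows(tbx, chrom, start, end))
    finally:
        tbx.close()
```

The coordinates currently passed to pytabix at each call site should be checked against pysam's 0-based, half-open convention during the migration, since an off-by-one there would be easy to miss.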

Rationale:

  • Stability: pysam is actively developed and is the standard for handling SAM/BAM/VCF/BCF/tabix files in the Python bioinformatics ecosystem.
  • Correctness: It resolves the known indexing and region querying bugs present in pytabix.
  • Precedent: This migration has been successfully implemented in other bioinformatics packages, such as boostdm and intogen-core, demonstrating its viability and benefits.

This change will improve the long-term stability and correctness of our data processing pipelines.

Metadata

Labels

bug, enhancement