immReferent

An Interface for Immune Receptor and HLA Gene Reference Data


Introduction

immReferent is an R package that provides a stable, reproducible, and lightweight interface to immune receptor (TCR/BCR) and HLA sequences from IMGT and to the AIRR Community's OGRDB. It serves as the backbone for computational immunology workflows by ensuring a consistent source of high-quality nucleotide and protein sequences.

Interactions

Please read more about IMGT and OGRDB and cite them in your work!

This package enables:

  • Downloading IMGT/OGRDB data (receptor and HLA sequences).
  • Caching for offline reproducibility.
  • Querying metadata (allele, gene, species).
  • Interoperability with Bioconductor objects (DNAStringSet, AAStringSet).
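To illustrate the last two points, here is a minimal sketch of working with a DNAStringSet whose names follow the IMGT-style "gene*allele" convention. It assumes the Biostrings package is installed; the sequences and names are toy placeholders, not real reference data, and the code does not call immReferent itself.

```r
library(Biostrings)

# Toy sequences with "gene*allele" style names (placeholders, not real reference data)
toy <- DNAStringSet(c(
  "GENEA*01" = "ATGGCCAAA",
  "GENEA*02" = "ATGGCCAAG",
  "GENEB*01" = "ATGTTTGGG"
))

# Split each name at the literal "*" into its gene and allele parts
parts <- strsplit(names(toy), "*", fixed = TRUE)
meta <- data.frame(
  allele_name = names(toy),
  gene        = vapply(parts, `[`, character(1), 1),
  allele      = vapply(parts, `[`, character(1), 2),
  width       = width(toy)
)

# Subset by gene, then translate nucleotides to an AAStringSet
gene_a    <- toy[meta$gene == "GENEA"]
gene_a_aa <- translate(gene_a)
```

Because the result is an ordinary Biostrings object, the usual subsetting, translation, and export functions apply without any conversion step.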

Why immReferent?

As immune repertoire analysis expands, a centralized sequence reference layer is critical for:

  • Integrating across tools like scRepertoire and immApex.
  • Reducing redundancy by avoiding repeated hard-coded IMGT/OGRDB downloads.
  • Guaranteeing reproducibility with cached versions.
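The general idea behind a cached reference layer can be sketched in base R as follows. This is only an illustration of the pattern, not immReferent's actual cache implementation; `cached_fetch`, `fake_download`, and the cache location are hypothetical names chosen for the example.

```r
# Hypothetical helper: run the download once, then reuse the stored copy
cached_fetch <- function(key, fetch_fun, cache_dir) {
  dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)
  path <- file.path(cache_dir, paste0(key, ".rds"))
  if (file.exists(path)) {
    return(readRDS(path))   # cache hit: no download needed
  }
  value <- fetch_fun()      # cache miss: perform the (possibly slow) download
  saveRDS(value, path)
  value
}

# Stand-in "download" that counts how often it is actually executed
calls <- 0
fake_download <- function() {
  calls <<- calls + 1
  c("GENEA*01" = "ATGGCC")  # placeholder sequence, not real reference data
}

demo_cache <- tempfile("ref_cache_")
first  <- cached_fetch("demo", fake_download, demo_cache)
second <- cached_fetch("demo", fake_download, demo_cache)
```

The second call returns the same object as the first without running the download again, which is what makes repeated analyses reproducible and usable offline.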

Installation

devtools::install_github("BorchLab/immReferent")

Or via Bioconductor (once accepted):

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("immReferent")

Getting Started

IMGT usage

IMGT serves as the reference for gene names, and its sequence information can be accessed via getIMGT(). Data from IMGT is distributed under a CC BY-NC-ND 4.0 license: attribution is required, and IMGT does not intend to allow derivative or commercial use.

library(immReferent)

# Check if IMGT is online
is_imgt_available()

# Download human IGHV sequences
ighv <- getIMGT("IGHV")

# Load cached data
cached <- loadIMGT("IGHV")

# List available datasets
listIMGT()

# Refresh cache
refreshIMGT("IGHV")
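
The calls above can be combined into an offline-safe pattern: download when IMGT is reachable, otherwise fall back to the cache. The wrapper below is a hypothetical sketch, not part of immReferent, and it assumes that is_imgt_available() returns a single TRUE or FALSE. Stand-in functions are used so the control flow can be followed without network access.

```r
# Hypothetical wrapper: prefer a fresh download, fall back to the cache when offline
fetch_or_cache <- function(is_online, download, load_cached) {
  if (isTRUE(is_online())) {
    download()
  } else {
    load_cached()
  }
}

# With immReferent installed, the call might look like this (not run here):
# ighv <- fetch_or_cache(
#   is_online   = is_imgt_available,
#   download    = function() getIMGT("IGHV"),
#   load_cached = function() loadIMGT("IGHV")
# )

# Stand-ins that simulate being offline
offline_result <- fetch_or_cache(
  is_online   = function() FALSE,
  download    = function() stop("should not be called when offline"),
  load_cached = function() "from cache"
)
```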

OGRDB usage

library(immReferent)

# Check if OGRDB is online
is_ogrdb_available()

# Download human IGK sequences
igk_airr <- getOGRDB(species = "human", locus = "IGK", type = "NUC", format = "AIRR")

# Load cached data
cached <- loadOGRDB(species = "human", locus = "IGK", type = "NUC", format = "AIRR")

# List available datasets
listOGRDB()

# Refresh cache
refreshOGRDB(species = "human", locus = "IGK", type = "NUC", format = "AIRR")

Bug Reports/New Features

If you run into a bug or other problem, please submit a GitHub issue describing it.

If possible, please include a reproducible example.

Any requests for new features or enhancements can also be submitted as GitHub issues.

Pull Requests are welcome for bug fixes, new features, or enhancements.

Citation

In Progress