Move pass-1 SJ tab from <prefix>SJ.pass1.out.tab to <prefix>_STARpass1/SJ.out.tab #28

@pinin4fjords

Summary

rustar's two-pass mode writes the pass-1 splice junction tab as <prefix>SJ.pass1.out.tab at the top level. STAR keeps two-pass intermediates inside <prefix>_STARpass1/ and names the pass-1 SJ tab SJ.out.tab inside that directory.

STAR reference behaviour

The two-pass orchestration in source/twoPass.cpp mkdirs <prefix>_STARpass1/ and redirects pass-1 output into it. The pass-1 splice tab lives at <prefix>_STARpass1/SJ.out.tab alongside a pass-1 Log.final.out.

Reproducer

#!/usr/bin/env bash
set -euo pipefail
mkdir -p /tmp/rustar-mre-28 && cd /tmp/rustar-mre-28

BASE=https://raw.githubusercontent.com/nf-core/test-datasets/626c8fab639062eade4b10747e919341cbf9b41a
curl -fsLO $BASE/reference/genome.fasta
curl -fsL  $BASE/reference/genes_with_empty_tid.gtf.gz | gunzip -c > genes.gtf
curl -fsLO $BASE/testdata/GSE110004/SRR6357072_1.fastq.gz
curl -fsLO $BASE/testdata/GSE110004/SRR6357072_2.fastq.gz

RUSTAR=ghcr.io/scverse/rustar-aligner:dev
STAR=community.wave.seqera.io/library/htslib_samtools_star_gawk:ae438e9a604351a4

mkdir -p idx-rustar idx-star
docker run --rm -v $PWD:/w -w /w $RUSTAR rustar-aligner --runMode genomeGenerate \
    --genomeDir idx-rustar --genomeFastaFiles genome.fasta --sjdbGTFfile genes.gtf \
    --sjdbOverhang 100 --genomeSAindexNbases 7
docker run --rm -v $PWD:/w -w /w $STAR STAR --runMode genomeGenerate \
    --genomeDir idx-star --genomeFastaFiles genome.fasta --sjdbGTFfile genes.gtf \
    --sjdbOverhang 100 --genomeSAindexNbases 7

COMMON=(--readFilesIn SRR6357072_1.fastq.gz SRR6357072_2.fastq.gz --readFilesCommand zcat
        --runThreadN 4 --sjdbGTFfile genes.gtf --twopassMode Basic --runRNGseed 0
        --outSAMtype BAM Unsorted)

docker run --rm -v $PWD:/w -w /w $RUSTAR rustar-aligner \
    --genomeDir idx-rustar "${COMMON[@]}" --outFileNamePrefix RUS.
docker run --rm -v $PWD:/w -w /w $STAR STAR \
    --genomeDir idx-star "${COMMON[@]}" --outFileNamePrefix STAR.

echo "=== STAR pass-1 layout ==="; ls STAR._STARpass1/
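# Check both candidate locations (top level, and inside RUS./ per #26); || true keeps set -e from aborting on a missing path.
echo "=== rustar layout ==="; ls RUS.SJ.pass1.out.tab RUS./SJ.pass1.out.tab RUS._STARpass1 RUS./_STARpass1 2>&1 || true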

Observed: STAR produces STAR._STARpass1/SJ.out.tab. rustar produces RUS.SJ.pass1.out.tab at the top level (or inside RUS./ per #26), with no _STARpass1 directory.
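The observation above can be turned into a pass/fail check, which may be handy as a regression test once the fix lands. This is only a sketch: the function name has_star_pass1_layout is hypothetical, and it checks nothing beyond the directory and the SJ.out.tab path described in this issue.

```shell
# Hypothetical check (the name is illustrative, not part of rustar or STAR):
# succeed only if the STAR-style pass-1 layout exists for the given
# --outFileNamePrefix value.
has_star_pass1_layout() {
    local prefix=$1
    # STAR keeps two-pass intermediates in <prefix>_STARpass1/ and names the
    # pass-1 splice junction tab SJ.out.tab inside it.
    [ -d "${prefix}_STARpass1" ] && [ -f "${prefix}_STARpass1/SJ.out.tab" ]
}
```

Run from the reproducer directory, `has_star_pass1_layout STAR.` should succeed, while `has_star_pass1_layout RUS.` should fail for as long as this issue is open.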

Suggested fix

Move the pass-1 SJ tab into <prefix>_STARpass1/SJ.out.tab (mkdir the parent first). File content unchanged.
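Until the fix is in rustar itself, the same move can be approximated downstream. The sketch below is a workaround under stated assumptions, not the proposed implementation: relocate_pass1_sj is a hypothetical helper, and it assumes the file sits at one of the two locations mentioned under Observed.

```shell
# Hypothetical helper (not part of rustar or STAR): move rustar's pass-1 SJ tab
# to the path STAR uses, <prefix>_STARpass1/SJ.out.tab.
relocate_pass1_sj() {
    local prefix=$1
    local dest_dir="${prefix}_STARpass1"
    local src
    # rustar may write the file next to the prefix or, per #26, inside <prefix>/.
    for src in "${prefix}SJ.pass1.out.tab" "${prefix}/SJ.pass1.out.tab"; do
        if [ -f "$src" ]; then
            mkdir -p "$dest_dir"                  # create the parent first
            mv -- "$src" "$dest_dir/SJ.out.tab"   # path changes, content does not
            return 0
        fi
    done
    echo "no pass-1 SJ tab found for prefix '$prefix'" >&2
    return 1
}
```

In the reproducer this would be called as `relocate_pass1_sj RUS.` after the rustar alignment step.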

A related follow-up (Log.out and Log.progress.out are also missing) is split out as #55.

Severity

Low. Output-shape compatibility cleanup. nf-core/rnaseq works around it with a permissive *.tab glob today.


Filed during nf-core/rnaseq integration testing (nf-core/rnaseq#1855).
