no transcript_ids produced in the gtf and the transcript tables #32

Description

@bhaggal

Dear IsoTools Team,

I have noticed that the tool does not appear to output specific transcript IDs when a single gene has multiple known isoforms. Because the resulting expression table only displays values aggregated at the gene level, it is difficult to determine which specific isoform is being expressed.
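To make the problem concrete, here is a toy pandas sketch of what gets lost. The gene names, transcript names, and counts are all made up for illustration; this is not IsoTools output and does not use the IsoTools API.

```python
import pandas as pd

# Hypothetical transcript-level counts (made-up names and numbers).
transcript_counts = pd.DataFrame(
    {
        "gene_id": ["geneA", "geneA", "geneB"],
        "transcript_id": ["geneA.1", "geneA.2", "geneB.1"],
        "sample1": [10, 0, 5],
        "sample2": [2, 8, 5],
    }
)

# Summing over transcripts reproduces the gene-level view I currently get.
gene_counts = transcript_counts.groupby("gene_id")[["sample1", "sample2"]].sum()
print(gene_counts)
```

In this toy case geneA has the same gene-level total in both samples, although the two samples use different isoforms. That per-isoform breakdown, keyed by a transcript ID, is what I am missing.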

Initially, I suspected this might be an issue unique to my dataset or potentially caused by a misconfiguration in my annotation GTF file. However, after testing the provided demo_data using the standard import workflow below, I encountered the exact same behavior:

# integrate the samples
for i, row in samples.iterrows():
    # this step takes about 5-30 seconds per sample
    isoseq.add_sample_from_bam(row.file_name, sample_name=row.sample_name, group=row.group)
# the sample table of the transcriptome object contains the number of imported reads
isoseq.sample_table

Given this layout, what is the recommended way to retrieve isoform/transcript-level quantification? Furthermore, would a fix for this require adjusting the pipeline downstream, or should transcript-level assignments be handled prior to the alignment step?

Thank you for your time and help!

Best regards,

Amandeep
