Question

BAMBU extended gtf file?


Seongwoo Han ▴ 10

@6d55f695


United States

Hello there, I want to know what "extended_annotations.gtf", one of BAMBU's main outputs, contains (see https://github.com/GoekeLab/sg-nex-data/blob/master/docs/SG-NEx_Bambu_tutorial.md#running-bambu). It sounds like extended_annotations.gtf is a file with the entire reference annotation plus all discovered novel transcripts. Its size is about 200 MB. What I am trying to get is something like "transcript_models.gtf", which has just the constructed transcripts (both known and novel) and not the entire reference annotation. To my knowledge, its size is 90 to 100 MB. Is there a way to get that filtered GTF through the command line?

I am using cDNA ONT and cDNA PacBio datasets. In case I missed something, below are the commands I used to convert the .fastq file to a BAM file for the cDNA ONT data.

./minimap2 -t 8 -ax splice /home/seong/R/x86_64-pc-linux-gnu-library/4.1/bambu/extdata/hg38.fa /data/long_read/ENCBS944CBA/ENCFF263YFG.fastq -o /data/long_read/ENCBS944CBA/ENCFF263YFG.sam

samtools view -@ 8 -Sb -o /data/long_read/ENCBS944CBA/ENCFF563QZR.bam /data/long_read/ENCBS944CBA/ENCFF563QZR.sam

Another question I have: does BAMBU detect intron retention? Please let me know about both questions, thanks a lot!

BAMBU bambu • 1.0k views

updated 2.3 years ago by Sheetal • written 2.5 years ago by Seongwoo Han


Hello Seongwoo, were you able to run bambu successfully? I am running into technical problems, so it would be great to get some help.

2.3 years ago • Sheetal

score 1 · Answer 1 · 2022-09-28

Hi Seongwoo,

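I think I addressed this on the GitHub issue, but I am repeating it here for the sake of users who might find this thread.

The first line below filters the bambu output object (se) down to the transcripts with a non-zero full-length read count (fullLengthCounts > 0). You can then write the output as usual.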

constructedAnnotations = se[assays(se)$fullLengthCounts > 0]
writeBambuOutput(constructedAnnotations, path = "./YOUR_PATH_HERE/")
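
If you would rather do the filtering on the command line after the fact, here is a minimal sketch using awk. It is not part of bambu: `filter_gtf_by_ids` is a hypothetical helper name, and it assumes you already have a plain-text file with one transcript ID per line (or a tab-separated table with the ID in the first column), however you choose to produce it. It keeps only the GTF records whose `transcript_id` attribute is in that list.

```shell
# Hypothetical helper (not a bambu command): keep only the GTF records whose
# transcript_id is listed in an ID file.
# Usage: filter_gtf_by_ids ids.txt extended_annotations.gtf > filtered.gtf
filter_gtf_by_ids() {
  awk -F '\t' '
    # First file: remember each transcript ID (first tab-separated column).
    FILENAME == ARGV[1] { keep[$1] = 1; next }
    # Second file: pass GTF comment/header lines through unchanged.
    /^#/ { print; next }
    # Pull the transcript_id value out of the attribute column (column 9).
    match($9, /transcript_id "[^"]+"/) {
      id = substr($9, RSTART + 15, RLENGTH - 16)
      if (id in keep) print
    }
  ' "$1" "$2"
}
```

Records without a `transcript_id` attribute (for example gene-level lines, if your GTF has any) are dropped by this sketch, and a header line in the ID file is harmless as long as it does not match a real transcript ID.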