Hi, I am trying to use Rsamtools to parse a BAM file produced by the STAR aligner. STAR writes a few optional fields beyond the standard SAM columns, but Rsamtools seems to return only the standard ones. For example:
> library(Rsamtools)
> bam_file = "~/yeast_77_78/yeast_77_78.bam"
> bf = BamFile(bam_file, asMates = TRUE, qnameSuffixStart = ".")
> # region inferred from the element name in the output below
> which = GRanges("chrI", IRanges(1, 230218))
> param = ScanBamParam(flag=scanBamFlag(isPaired=TRUE),
+                      what=scanBamWhat(), which=which)
> bam <- scanBam(bam_file, param=param)
> names(bam$`chrI:1-230218`)
[1] "qname" "flag" "rname" "strand" "pos" "qwidth" "mapq" "cigar" "mrnm" "mpos"
[11] "isize" "seq" "qual"
However, STAR alignments carry extra columns that are useful, for example:
NH:i:3 HI:i:2 AS:i:46 nM:i:0
Here NH:i:3 gives the number of hits, HI:i:2 gives the index of the current hit, and so on. Rsamtools does not seem to pick up these extra columns. Is there any way to include them by specifying ScanBamParam? I could certainly export the BAM to a SAM file, but that takes disk space, and it would be ideal if such columns could be specified or included directly when running scanBam. Thanks for your help.
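For readers with the same question, here is a minimal sketch of one way this is usually handled: ScanBamParam takes a tag argument naming the two-letter optional fields to read, and scanBam then returns them in a nested "tag" element. The file path is the one from the post; the chrI region is an assumption inferred from the element name in the output above, and the choice of "what" fields is only for illustration. A BAM index is assumed to sit next to the file, since "which" needs one.

```r
library(Rsamtools)

# Path taken from the question; the region is an assumption based on the
# "chrI:1-230218" element name shown in the question's output.
bam_file <- "~/yeast_77_78/yeast_77_78.bam"
which <- GRanges("chrI", IRanges(1, 230218))

# Request the optional STAR fields by their two-letter tag names.
param <- ScanBamParam(flag = scanBamFlag(isPaired = TRUE),
                      what = c("qname", "flag", "pos", "cigar"),
                      tag = c("NH", "HI", "AS", "nM"),
                      which = which)

# Only read the file if it is actually present on this machine.
if (file.exists(bam_file)) {
  bam <- scanBam(bam_file, param = param)
  rec <- bam[[1]]
  # Tag values come back as one vector per tag under rec$tag.
  hits <- data.frame(qname = rec$qname,
                     NH = rec$tag$NH,
                     HI = rec$tag$HI,
                     AS = rec$tag$AS,
                     nM = rec$tag$nM)
  print(head(hits))
}
```

Only the tags listed in tag= are read, so the request can be trimmed to the fields that are actually needed (for example just "NH" and "HI").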
Thanks so much! That's exactly the solution I am looking for.