Hi folks,
I want to use countBam
from Rsamtools
to count the number of lines in a bam file matching certain criteria - mapping quality and not having the "XA" tag. On the command line, I would do this as follows:
samtools view -q 3 my_file_sort.bam | grep -v "XA:" | wc -l
However, I don't think there is a way to exclude tags using Rsamtools
, since the help for ScanBamParam
says:
tagFilter
A named list of atomic vectors. The name of each list element is the tag name (two-letter code), and the corresponding atomic vector is the set of acceptable values for the tag. Only reads with specified tags are included. NULLs, NAs, and empty strings are not allowed in the atomic vectors.
Of course I can count the reads on the command line using samtools, or count all reads passing the MAPQ filter and then all reads passing the MAPQ filter with the "XA" tag, and then calculate the difference... but is there a better way to do this in R?