Hello,
somehow the scanBam function does not properly read in the mapq scores from this file:
HWI-ST227:407:C56MLACXX:1:1316:6877:64103 0 chr10 3153911 255 24M * 0 0 CCCAATTAATGGGGGCAGGATAAG HFHGHI@DAEI@GGIDEBGFGHHG NH:i:1 HI:i:1 AS:i:19 nM:i:2
HWI-ST227:407:C56MLACXX:1:2102:2752:79037 0 chr10 3153911 255 24M * 0 0 CCCAATTAATGGGGGCAGGATAAG HGIJJIIJJIGHIJJJIJJIJJIG NH:i:1 HI:i:1 AS:i:19 nM:i:2
HWI-ST227:407:C56MLACXX:1:2113:3448:85785 0 chr10 3175788 255 24M * 0 0 TCTTTTAGATCTATTTACTCTGGA ######################## NH:i:1 HI:i:1 AS:i:19 nM:i:2
HWI-ST227:407:C56MLACXX:1:1203:13148:73589 16 chr10 3180818 255 24M * 0 0 TAGGCTTCAAGGAGAGTCTCCTGT ?DBEIIEDD?99>C@DEEE9FFC4 NH:i:1 HI:i:1 AS:i:23 nM:i:0
They all should be 255 but are read in as NA.
The code I used was just:
scanBam("test.bam")[[1]]$mapq
[1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[34] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
What could be the reason for this?
Best regards,
Felix
PS: How could I provide the test.bam file for testing?
Martin I see your point, however the value 255 for MAPQ is often used (in particular by the aligner topHat and STAR) for uniquely mapped reads.
This may not be a good practice ("No alignments should be assigned mapping quality 255" (SAM spec, "2 Recommended practice for the SAM format"), but it is used, maybe by default scanBAM could allow 255?
I'll likely change the behavior, unless I hear arguments against...
Should be updated in Rsamtools 1.19.42, in svn now and hopefully via biocLite() after Sunday at 10am ish