Rsamtools idxstatsBam() does not return unmapped reads

@mja18-23865

I noticed that Rsamtools::idxstatsBam() does not return unmapped reads. With command-line samtools idxstats, the last line reported always has the reference name '*', and that line gives the count of all unmapped reads. The '*' line is missing from the output of Rsamtools::idxstatsBam().

The reproducible example below uses a remote file, but the behavior is the same with local BAM files.


library(Rsamtools)
idxstatsBam("http://plantsmallrnagenes.science.psu.edu/bol-b1.0/alignments/SRR799356_3Q.bam")
  seqnames seqlength  mapped unmapped
1      C01  38761720 1091379        0
2      C02  44046003 1209217        0
3      C03  57781463 1763962        0
4      C04  40895475 2461661        0
5      C05  32828328 1657834        0
6      C06  40704471 1388970        0
7      C07  48346208 4523078        0
8      C08  41516064 3736635        0
9      C09  40126856 1518657        0

Compare to this:

# in bash
samtools idxstats http://plantsmallrnagenes.science.psu.edu/bol-b1.0/alignments/SRR799356_3Q.bam
C01 38761720    1091379 0
C02 44046003    1209217 0
C03 57781463    1763962 0
C04 40895475    2461661 0
C05 32828328    1657834 0
C06 40704471    1388970 0
C07 48346208    4523078 0
C08 41516064    3736635 0
C09 40126856    1518657 0
*   0   0   5517000
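
For anyone who only needs the number from that last line, here is a minimal sketch of pulling it out of the command-line output. It assumes the four-column layout shown above (reference name, length, mapped, unmapped); `unmapped_from_idxstats` is a hypothetical helper name, not part of samtools, and `sample.bam` is a placeholder path.

```shell
#!/usr/bin/env bash
# Sketch: read the unmapped count from `samtools idxstats` output.
# `unmapped_from_idxstats` is a hypothetical helper, not a samtools command.
unmapped_from_idxstats() {
  # Columns: reference name, sequence length, mapped, unmapped.
  # The row whose reference name is "*" holds reads with no assigned reference.
  awk '$1 == "*" { print $4 }'
}

# Offline demo using rows copied from the output above.
result=$(printf '%s\n' \
  'C01 38761720 1091379 0' \
  'C09 40126856 1518657 0' \
  '* 0 0 5517000' | unmapped_from_idxstats)
echo "$result"

# With samtools installed and an indexed BAM (placeholder path):
#   samtools idxstats sample.bam | unmapped_from_idxstats
```

This is only a stopgap on the shell side; it does not change what idxstatsBam() returns in R.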

sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] Rsamtools_2.4.0      Biostrings_2.56.0    XVector_0.28.0       GenomicRanges_1.40.0 GenomeInfoDb_1.24.2 
[6] IRanges_2.22.2       S4Vectors_0.26.1     BiocGenerics_0.34.0 

loaded via a namespace (and not attached):
 [1] rstudioapi_0.13        magrittr_2.0.1         knitr_1.31             zlibbioc_1.34.0       
 [5] BiocParallel_1.22.0    R6_2.5.0               rlang_0.4.10           stringr_1.4.0         
 [9] tools_4.0.2            xfun_0.22              tinytex_0.30           jquerylib_0.1.3       
[13] htmltools_0.5.1.1      yaml_2.2.1             digest_0.6.27          crayon_1.4.1          
[17] GenomeInfoDbData_1.2.3 sass_0.3.1             bitops_1.0-6           RCurl_1.98-1.3        
[21] evaluate_0.14          rmarkdown_2.7          stringi_1.5.3          compiler_4.0.2        
[25] bslib_0.2.4            jsonlite_1.7.2

@martin-morgan-1513

This feature is available in Rsamtools 2.7.2. That version requires R 4.1 / Bioconductor 3.13 and will be available in the next couple of days. Thanks for the request!

Great, thanks Martin!
