Extract chromosomes and read starts from a BAM file into a data.frame?
Vang Le ▴ 80
@vang-le-6690
Last seen 4.6 years ago
Denmark

Like the title says, I am looking for a concise, high-performance way to extract only the chromosome name and read start position from a BAM file. This is easy to do outside R:

samtools view my.bam | cut -f 3,4

but I want to do it from within R code. Calling the command via `system` is OK.
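
For reference, the shell route can be wrapped in R roughly as follows. This is only a sketch: it assumes `samtools` and `cut` are on the PATH, `my.bam` is the placeholder file name from the command above, and `samtools_cmd` and `read_chrom_pos` are hypothetical helper names made up for illustration. It uses `pipe()` rather than `system()` so the output can be read directly by `read.table()`.

```r
# Build the shell pipeline; 'samtools' is assumed to be on the PATH.
samtools_cmd <- function(bam) {
  paste("samtools view", shQuote(bam), "| cut -f 3,4")
}

# Read the two tab-separated columns from the pipe into a data.frame.
# In SAM text output, unmapped reads appear with rname "*" and pos 0.
read_chrom_pos <- function(bam) {
  read.table(
    pipe(samtools_cmd(bam)),
    sep = "\t",
    col.names = c("rname", "pos"),
    colClasses = c("character", "integer"),
    quote = "",
    comment.char = ""
  )
}

# Usage (only runs when samtools and the file are actually available):
if (nzchar(Sys.which("samtools")) && file.exists("my.bam")) {
  df <- read_chrom_pos("my.bam")
  print(head(df))
}
```

This goes through a text conversion of every record, so it is likely slower than reading the BAM fields directly.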

 


 

Tags: bam, rsamtools, samtools, genomicalignments
@martin-morgan-1513
Last seen 3 months ago
United States

Use Rsamtools, specifying a ScanBamParam() with just the information you'd like to extract. Coerce the result to a data.frame (it's just a list anyway, so this is inexpensive).

> library(Rsamtools)
> fl = system.file(package="Rsamtools", "extdata", "ex1.bam")
> p = ScanBamParam(what=c("rname", "pos"))
> head(as.data.frame(scanBam(fl, param=p)))
  rname pos
1  seq1   1
2  seq1   3
3  seq1   5
4  seq1   6
5  seq1   9
6  seq1  13
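
A slightly more explicit variant of the same idea, as a sketch only. It assumes you want mapped reads only (unmapped reads would typically come back with NA in both fields), which is why a `scanBamFlag()` filter is added; that filter is my addition and not part of the answer above. It also builds the data.frame from the first list element by hand instead of relying on `as.data.frame()`.

```r
library(Rsamtools)

fl <- system.file(package = "Rsamtools", "extdata", "ex1.bam")

# Assumption: only mapped reads are wanted, so filter on the flag field.
p <- ScanBamParam(
  flag = scanBamFlag(isUnmappedQuery = FALSE),
  what = c("rname", "pos")
)

# scanBam() returns a list with one element per requested range;
# no ranges are given via 'which' here, so there is a single element.
res <- scanBam(fl, param = p)[[1]]

# rname is a factor of reference names, pos the 1-based leftmost position.
df <- data.frame(rname = res$rname, pos = res$pos)
str(df)
```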

The help pages ?ScanBamParam and ?scanBam, the package vignette (browseVignettes("Rsamtools")), and the GenomicAlignments package are likely to be helpful.
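
As an illustration of the GenomicAlignments route, here is a minimal sketch using `readGAlignments()` on the same example file. It assumes the package is installed; note that `readGAlignments()` loads mapped reads only by default, and it also reads the CIGAR and strand, so it does more work than the two-field `scanBam()` call.

```r
library(GenomicAlignments)

fl <- system.file(package = "Rsamtools", "extdata", "ex1.bam")

# Load the alignments as a GAlignments object.
ga <- readGAlignments(fl)

# seqnames() gives the reference name, start() the leftmost aligned position.
df <- data.frame(
  rname = as.character(seqnames(ga)),
  pos   = start(ga)
)
head(df)
```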
