Hi,
A colleague approached me in hopes of figuring out a strange export problem they were having with rtracklayer. For some reason it seems that when I have over 255-256 chromosomes (manual tested this) in a genome, that certain chromosome(s) are lost during the export of a bigwig.
This was noticed by said colleague as they loaded a rtracklayer generated bigWig into IGV and the X chromosome was missing.
Here is the code being run:
library(GenomicAlignments)
library(rtracklayer)
gr <- readGAlignments('Downloads/A250_S1_001_subset.bam')
seqlevelsStyle(gr) <- 'UCSC'
gg <- coverage( gr )
gg$chrX
export.bw(gg,'test40.bw')
imported.bw <- import('test40.bw')
coverage( imported.bw )$chrX
Here is the output:
> library(GenomicAlignments)
> library(rtracklayer)
> gr <- readGAlignments('Downloads/A250_Wills_S1_001_subset.bam')
> seqlevelsStyle(gr) <- 'UCSC'
> gg <- coverage( gr )
> gg$chrX
integer-Rle of length 123869142 with 177 runs
Lengths: 488442 74 8 75 499789 1 ... 540531 74 172 73 1743866
Values : 0 1 0 1 0 1 ... 0 1 0 1 0
> export.bw(gg,'test40.bw')
> imported.bw <- import('test40.bw')
> coverage( imported.bw )$chrX
integer-Rle of length 123869142 with 1 run
Lengths: 123869142
Values : 0
As you can see somehow the X chromosome has lost al of its coverage information? However it can be noted that if we subset the coverage object (gg) to a list of coverages with less than about 255 chromosomes then the X chromosome coverage returns in the bigwig generated during the "output.bw" step. Currently, we have a 2422 chromosomes (based on the seqinfo call) with no seqlengths provided. As a side note, other chromosomes seem to keep their correct coverage values in the bigwig.
Thanks,
Ben
R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.9.5 (Mavericks)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] rtracklayer_1.32.2 GenomicAlignments_1.8.4 Rsamtools_1.24.0 Biostrings_2.40.2 XVector_0.12.1
[6] SummarizedExperiment_1.2.3 Biobase_2.32.0 GenomicRanges_1.24.3 GenomeInfoDb_1.8.7 IRanges_2.6.1
[11] S4Vectors_0.10.3 BiocGenerics_0.18.0
loaded via a namespace (and not attached):
[1] XML_3.98-1.4 bitops_1.0-6 zlibbioc_1.18.0 BiocParallel_1.6.6 tools_3.3.1 RCurl_1.95-4.8
Is there a way to download version 1.35.1? and or a work around hack to get this to work under current framework? I'm having a similar problem that is affecting pipelines but the latest update of rtracklayer is 1.34.0
It will be in 1.34.1