rtracklayer import.bw on gz file makes R crash
Janet Young ▴ 740
@janet-young-2360
Last seen 5.1 years ago
Fred Hutchinson Cancer Research Center,…
Hi there,

By trying to do something that's not supposed to be done, I accidentally found a fatal R-killing issue with import.bw: I tried using import.bw on a bigWig file while it was still gzipped. You never know, sometimes these things work. I've since found an email thread that might help me avoid unpacking the file ( https://stat.ethz.ch/pipermail/bioconductor/2011-April/038734.html ), but in the meantime I thought I should tell you about how I made R die.

Code to reproduce this, details and sessionInfo are all below.

thanks,

Janet

########

library(rtracklayer)

test.bw.file <- system.file("tests", "test.bw", package = "rtracklayer")

## import works, as it's supposed to
bw <- import(test.bw.file, ranges = GenomicRanges::GRanges("chr19", IRanges(1, 6e7)))

### I copied that test.bw file to a local dir and used gzip to compress it
test.bw.file.gz <- "temp/test.bw.gz"

### import gives an error if I try to read in the gz file directly - that's OK
bw2 <- import(test.bw.file.gz, ranges = GenomicRanges::GRanges("chr19", IRanges(1, 6e7)))
## Error in .importForFormat(format) : No import function for 'gz' found

## import.bw also works as it's supposed to on the uncompressed file
bw3 <- import.bw(test.bw.file, ranges = GenomicRanges::GRanges("chr19", IRanges(1, 6e7)))

## here's sessionInfo output, before I kill off R:
sessionInfo()
# R version 2.13.1 (2011-07-08)
# Platform: i386-apple-darwin9.8.0/i386 (32-bit)
#
# locale:
# [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
#
# attached base packages:
# [1] stats graphics grDevices utils datasets methods base
#
# other attached packages:
# [1] rtracklayer_1.12.4 RCurl_1.6-7 bitops_1.0-4.1
#
# loaded via a namespace (and not attached):
# [1] Biostrings_2.20.2 BSgenome_1.20.0 GenomicRanges_1.4.6 IRanges_1.10.5
# [5] XML_3.4-2

##### now, using import.bw on the gz file makes R crash with no error report on Mac
bw4 <- import.bw(test.bw.file.gz, ranges = GenomicRanges::GRanges("chr19", IRanges(1, 6e7)))

#### I also see the same problem on linux, where my sessionInfo is as follows:
# R version 2.13.1 (2011-07-08)
# Platform: x86_64-unknown-linux-gnu (64-bit)
#
# locale:
# [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
# [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
# [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
# [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
# [9] LC_ADDRESS=C LC_TELEPHONE=C
# [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#
# attached base packages:
# [1] stats graphics grDevices utils datasets methods base
#
# other attached packages:
# [1] rtracklayer_1.12.4 RCurl_1.6-5 bitops_1.0-4.1
#
# loaded via a namespace (and not attached):
# [1] Biostrings_2.20.2 BSgenome_1.20.0 GenomicRanges_1.4.6
# [4] IRanges_1.10.5 XML_3.4-2
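For anyone who lands here with a gzipped bigWig, one possible workaround is to check for gzip compression and unpack to a temporary file before calling import.bw. The sketch below uses base R only; `is_gzipped` and `gunzip_to_temp` are hypothetical helper names and are not part of rtracklayer, and the demo bytes are a made-up payload rather than a real bigWig file.

```r
# Hypothetical helpers (not part of rtracklayer); base R only.

# TRUE if the file starts with the gzip magic bytes 0x1f 0x8b.
is_gzipped <- function(path) {
  con <- file(path, "rb")  # binary mode: bytes are read as stored on disk
  on.exit(close(con))
  magic <- readBin(con, what = "raw", n = 2L)
  identical(magic, as.raw(c(0x1f, 0x8b)))
}

# Decompress a gzip file into a temporary file and return the new path.
gunzip_to_temp <- function(path, ext = ".bw") {
  out <- tempfile(fileext = ext)
  src <- gzfile(path, "rb")
  dst <- file(out, "wb")
  on.exit({
    close(src)
    close(dst)
  })
  repeat {
    chunk <- readBin(src, what = "raw", n = 1e6)
    if (length(chunk) == 0L) break
    writeBin(chunk, dst)
  }
  out
}

# Demo on a small made-up payload (not a real bigWig file).
payload <- as.raw(c(0x26, 0xfc, 0x8f, 0x88, 1:10))
plain <- tempfile(fileext = ".bw")
writeBin(payload, plain)
gz <- paste0(plain, ".gz")
con <- gzfile(gz, "wb")
writeBin(payload, con)
close(con)

restored <- gunzip_to_temp(gz)
```

With rtracklayer loaded, the call from the post could then become `import.bw(if (is_gzipped(f)) gunzip_to_temp(f) else f, ranges = ...)`, where `f` is the path to the file. This costs a temporary copy on disk, but it means import.bw never sees the gzipped file.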
@michael-lawrence-3846
Last seen 3.0 years ago
United States
On Fri, Aug 12, 2011 at 1:53 PM, Janet Young <jayoung@fhcrc.org> wrote:

> Hi there,
>
> By trying to do something that's not supposed to be done, I accidentally
> found a fatal R-killing issue with import.bw: I tried using import.bw on a
> bigWig file while it was still gzipped.

Is there a reason to gzip a bigWig file? I think it already supports gzipped data blocks (but not a gzipped header, which is why this fails). See:
http://comments.gmane.org/gmane.science.biology.ucscgenome.general/8900

> You never know, sometimes these things work - I've since found an email
> thread that might help me avoid unpacking the file (
> https://stat.ethz.ch/pipermail/bioconductor/2011-April/038734.html ) but
> in the meantime I thought I should tell you about how I made R die.
>
> ########
>
> library(rtracklayer)
>
> test.bw.file <- system.file("tests", "test.bw", package = "rtracklayer")
>
> ## import works, as it's supposed to
> bw <- import(test.bw.file, ranges = GenomicRanges::GRanges("chr19",
> IRanges(1, 6e7)))
>
> ### I copied that test.bw file to a local dir and used gzip to compress it
> test.bw.file.gz <- "temp/test.bw.gz"
>
> ### import gives error if I try to read in the gz file directly - that's OK
> bw2 <- import(test.bw.file.gz, ranges = GenomicRanges::GRanges("chr19",
> IRanges(1, 6e7)))
> ## Error in .importForFormat(format) : No import function for 'gz' found

I just added support for automatic decompression of gzip files, so this would no longer fail, in general.

> ## import.bw also works as its supposed to on the uncompressed file
> bw3 <- import.bw(test.bw.file, ranges = GenomicRanges::GRanges("chr19",
> IRanges(1, 6e7)))
>
> ##### now, using import.bw on the gz file makes R crash with no error
> report on Mac
> bw4 <- import.bw(test.bw.file.gz, ranges = GenomicRanges::GRanges("chr19",
> IRanges(1, 6e7)))

rtracklayer now catches exceptions thrown by Jim Kent's library, so R should no longer be aborted when things go wrong. In this case, since gzipped files are not supported, R will issue a warning and error to that effect.

Thanks for this report,
Michael
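Once the failure surfaces as an ordinary R warning and error, as described above, a script can trap it and carry on. The following is a minimal sketch of that pattern, not rtracklayer code: `try_import` is a hypothetical wrapper, and `fake_importer` is a stand-in with invented messages, included only so the example runs without rtracklayer installed. A wrapper like this cannot rescue the hard crash seen with rtracklayer 1.12.4, since that never reaches R's condition system.

```r
# Hypothetical wrapper (not part of rtracklayer): run an import function,
# report warnings as messages, and turn any R-level error into NULL so a
# batch job can continue with the next file.
try_import <- function(path, importer, ...) {
  withCallingHandlers(
    tryCatch(
      importer(path, ...),
      error = function(e) {
        message("Could not import '", path, "': ", conditionMessage(e))
        NULL
      }
    ),
    warning = function(w) {
      message("Warning while importing '", path, "': ", conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
}

# Stand-in importer with made-up messages. It mimics an importer that warns
# and then fails when handed a path ending in ".gz".
fake_importer <- function(path, ...) {
  if (grepl("\\.gz$", path)) {
    warning("file looks gzipped")
    stop("gzipped bigWig files are not supported")
  }
  paste("imported", path)
}

ok <- try_import("test.bw", fake_importer)      # returns the importer's value
bad <- try_import("test.bw.gz", fake_importer)  # reports the problem, returns NULL
```

With a version of rtracklayer that raises an error instead of aborting, the same wrapper could be called as `try_import(test.bw.file.gz, rtracklayer::import.bw, ranges = ...)`.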