Janet Young
Fred Hutchinson Cancer Research Center,…
Hi again,
I'm digging into rtracklayer more and have found another odd issue. I
have a big query that I know fails. I then run a second, smaller query
that succeeds. If I then run the first big query again, it appears to
work but returns the result of the second, smaller query, even after I
delete the smaller query and its result. I hope the full
code below makes this clear. It seems that something is being kept in
memory that shouldn't be. Does this make any sense?
thanks,
Janet
library(rtracklayer)
library(GenomicRanges)
session <- browserSession("UCSC")
genome(session) <- "hg19"
#### make some sample ranges - a large number of small ranges:
numRanges <- 50000
rangeWidths <- 50
myRanges <- GRanges( seqnames=rep("chr1",numRanges),
ranges=IRanges(start=1:numRanges*rangeWidths*2,width=rangeWidths) )
#### run a query - this one is too big, and fails (I already emailed
#### about this error yesterday):
query <- ucscTableQuery (session, "cons46way", range=myRanges)
tableName(query) <- "phyloP46wayPrimates"
scores <- track(query)
## here's the error:
Error in .Call2("solve_user_SEW0", start, end, width, PACKAGE =
"IRanges") :
solving row 100001: range cannot be determined from the supplied
arguments (too many NAs)
In addition: Warning messages:
1: In matrix(as.numeric(unlist(split_lines)), nrow = 2) :
NAs introduced by coercion
2: In matrix(as.numeric(unlist(split_lines)), nrow = 2) :
data length [200005] is not a sub-multiple or multiple of the number
of rows [2]
### now run a small query that works
query1 <- ucscTableQuery (session, "cons46way",
range=myRanges[201:210])
tableName(query1) <- "phyloP46wayPrimates"
scores1 <- track(query1)
length(scores1)
# [1] 500
### now run the first query again (the one that failed) - this time it
### appears to work and returns the same result as query1
query2 <- ucscTableQuery (session, "cons46way", range=myRanges)
tableName(query2) <- "phyloP46wayPrimates"
scores2 <- track(query2)
length(scores2)
# [1] 500
identical(scores1, scores2)
# [1] TRUE
#### even if I remove all the queries and results from before, the big
#### query that would normally fail still returns the results of the
#### second small query. Something is not being reset that should be:
rm(query, query1,scores1,query2,scores2,numRanges,rangeWidths)
ls()
#[1] "myRanges" "numRanges" "rangeWidths" "session"
query3 <- ucscTableQuery (session, "cons46way", range=myRanges)
tableName(query3) <- "phyloP46wayPrimates"
scores3 <- track(query3)
length(scores3)
# [1] 500
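One possible follow-up check, not run here: repeat the big query from a brand-new session to see whether the stale result travels with the `session` object. This is only a sketch under the assumption that the leftover state is tied to the session (locally or on the UCSC side). `freshSession`, `freshQuery` and `freshResult` are names made up for this illustration, and `myRanges` is the object defined above.

```r
library(rtracklayer)

# Assumption: the stale state is tied to the existing session, so start
# from a new one rather than reusing `session`.
freshSession <- browserSession("UCSC")
genome(freshSession) <- "hg19"

# Same big query as before, but issued through the new session.
freshQuery <- ucscTableQuery(freshSession, "cons46way", range=myRanges)
tableName(freshQuery) <- "phyloP46wayPrimates"

# Capture the outcome instead of stopping, so the two cases can be told apart.
freshResult <- tryCatch(track(freshQuery), error=function(e) e)

if (inherits(freshResult, "error")) {
  message("fresh session fails again: ", conditionMessage(freshResult))
} else {
  message("fresh session returned ", length(freshResult), " ranges")
}
```

If the fresh session fails with the original error while the old session keeps returning 500 ranges, that would point at the old session rather than at anything in the R workspace.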
##################
sessionInfo()
R version 3.0.1 Patched (2013-07-29 r63455)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] rtracklayer_1.21.9 GenomicRanges_1.13.35 XVector_0.1.0
[4] IRanges_1.19.19 BiocGenerics_0.7.3
loaded via a namespace (and not attached):
[1] Biostrings_2.29.14 bitops_1.0-5 BSgenome_1.29.1 RCurl_1.95-4.1
[5] Rsamtools_1.13.26 stats4_3.0.1 tools_3.0.1 XML_3.98-1.1
[9] zlibbioc_1.7.0