Maximal length of Rle vectors
@hans-ulrich-klein-1945
Dear all,

I observed this problem regarding the maximal length of an Rle vector:

> rle = Rle(rep(0, 1000000000))
> length(rle)
[1] 1000000000
> length(c(rle, rle, rle))
[1] -1294967296

It is probably caused by the largest positive number (~2.1E9) that an integer variable can represent. However, there is no warning message.

I noticed this problem when I wanted to calculate the average coverage of a sequencing project across the human genome. I used the coverage() method and then concatenated all chromosomes. This should give me an Rle vector of length ~3*10^9, but mean() does not work on that vector.

Best,
Hans-Ulrich

> sessionInfo()
R version 2.14.0 (2011-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] IRanges_1.12.3
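The negative length reported above is consistent with 32-bit wraparound. The following is a minimal base R sketch of the arithmetic, assuming (the thread does not confirm this) that the combined length was accumulated in a 32-bit signed C integer. The variable names are illustrative only.

```r
# Largest value an R integer can hold (2^31 - 1).
int_max <- .Machine$integer.max

# True length of three concatenated 1e9-long vectors, computed in double precision.
true_len <- 3 * 1e9

# Assumption: a 32-bit signed counter wraps modulo 2^32 once it passes int_max.
wrapped_len <- if (true_len > int_max) true_len - 2^32 else true_len
print(wrapped_len)  # -1294967296, the value length() reported

# At the R level, integer sums do not wrap: they return NA with a warning.
run_lengths <- c(1500000000L, 1500000000L)
int_total <- suppressWarnings(sum(run_lengths))  # NA
dbl_total <- sum(as.numeric(run_lengths))        # 3e9, exact in double precision
```

Converting to double before summing keeps the total exact here, because doubles represent every integer up to 2^53.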
@herve-pages-1542
Hi Hans-Ulrich,

Thanks for the bug report. A fix is on its way. It will raise an error when one tries to create an Rle with length > .Machine$integer.max. Allowing an Rle to have a length > .Machine$integer.max, even with a warning, would cause all sorts of problems, the first being that its length would be NA:

> Rle(1:2, c(1500000000, 1500000000))
'integer' Rle of length NA with 2 runs
  Lengths: 1500000000 1500000000
  Values :          1          2
Warning message:
In sum(runLength(x)) : Integer overflow - use sum(as.numeric(.))

Note that the coverage across the human genome is best represented by a named RleList (with one element per chromosome), which doesn't have the .Machine$integer.max limitation. See the "GenomicRanges Use Cases" vignette in the GenomicRanges package for an illustration of this.

Cheers,
H.

On 11-11-30 09:28 AM, Hans-Ulrich Klein wrote:
> Dear all,
>
> I observed this problem regarding the maximal length of a Rle vector:
> [...]

--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
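The per-chromosome representation also makes the genome-wide mean computable without ever building one long vector: it is a length-weighted average over the runs. Below is a minimal base R sketch of that idea. It uses a plain named list of base `rle()` objects as a stand-in for an RleList, and the toy data and variable names are hypothetical. With a real RleList, the analogous quantities would presumably come from runLength() and runValue().

```r
# Toy stand-in for per-chromosome coverage (hypothetical data, not real coverage).
cov_by_chr <- list(
  chr1 = rle(c(0, 0, 3, 3, 3, 1)),
  chr2 = rle(c(2, 2, 0, 0))
)

# Sum of coverage over all positions: run length times run value, per chromosome.
# Run lengths are converted to double so the totals cannot overflow integer range.
total_cov <- sum(vapply(
  cov_by_chr,
  function(r) sum(as.numeric(r$lengths) * r$values),
  numeric(1)
))

# Total number of positions across all chromosomes, again in double precision.
total_len <- sum(vapply(
  cov_by_chr,
  function(r) sum(as.numeric(r$lengths)),
  numeric(1)
))

# Genome-wide mean coverage, without concatenating the chromosomes.
genome_mean <- total_cov / total_len
print(genome_mean)  # 1.4 for the toy data: (10 + 4) / (6 + 4)
```

Because both totals are accumulated as doubles, the same computation stays valid when the total length exceeds .Machine$integer.max, as it does for the human genome.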