Problem
I am using the [`IRanges`][1] package and need to accurately represent very long sequences whose coordinates exceed 2^31 by about 10-fold.
From the following, it seems that `IRanges` stores positions as 32-bit integers (`int32`):

    ##### INSTALLATION FROM SRC CODE ######
    ## try http:// if https:// URLs are not supported
    source("https://bioconductor.org/biocLite.R")
    biocLite("IRanges")
    ##### CALL PACKAGE #####
    require(IRanges)
    IRanges(start=1,end=2^31-1) # Works fine
    IRanges(start=1,end=2^31) # Fail
    Error in .Call2("solve_user_SEW0", start, end, width, PACKAGE = "IRanges") :
      solving row 1: range cannot be determined from the supplied arguments (too many NAs)
    In addition: Warning message:
    In .normargSEW0(end, "end") : NAs introduced by coercion to integer range

As this package is often used for DNA sequences, it would be very useful to be able to deal with values greater than 2^31 (≈ 2 × 10^9), as many organisms have genomes longer than that.
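
For what it is worth, base R on its own seems to show the same limit, which is why I suspect the coercion to integer rather than anything specific to my data. This is only a minimal sketch without `IRanges`; the variable names (`int_max`, `fits`, `overflowed`, `big`) are just for illustration.

```r
# Base R only; IRanges is not needed for this check.
int_max <- .Machine$integer.max    # largest value an R integer can hold
print(int_max == 2^31 - 1)         # is the limit exactly 2^31 - 1?

fits <- as.integer(2^31 - 1)       # still representable as an integer

# Without suppressWarnings() this warns:
# "NAs introduced by coercion to integer range"
overflowed <- suppressWarnings(as.integer(2^31))
print(is.na(overflowed))           # does 2^31 become NA?

# A double can still hold the value I need (about 10 * 2^31) as a whole number,
# so the limit appears to come from the integer type, not from R numerics.
big <- 10 * 2^31
print(big == floor(big))
```

So the value itself is fine as a `numeric`; it only breaks once something converts it to `integer`.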
Question
- Am I right to think that this is an integer overflow issue?
- Do you encounter the same issue?
- Is there a way around this problem?
The only solution I have found is to accept a loss of accuracy and divide each width by 100, but I am not very happy with that.
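
To make the trade-off concrete, here is a rough sketch of that workaround in base R. The names and values (`scale_factor`, `true_end`, `pos`) are made up for illustration and are not part of `IRanges`.

```r
scale_factor <- 100                  # the divisor mentioned above

# A coordinate about 10 times larger than 2^31: too big for an R integer.
true_end <- 10 * 2^31

# After scaling, the value fits in the integer range again.
scaled_end <- ceiling(true_end / scale_factor)
print(scaled_end <= .Machine$integer.max)
scaled_end_int <- as.integer(scaled_end)

# The cost: positions less than scale_factor apart can collapse
# onto the same scaled coordinate.
pos <- c(12345, 12399)
scaled_pos <- pos %/% scale_factor   # integer division
print(scaled_pos)
```

The scaled coordinates could then be passed to `IRanges()`, but every range is only accurate to within 100 positions.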
My R version

    R version 3.2.3 (2015-12-10) -- "Wooden Christmas-Tree"
    Copyright (C) 2015 The R Foundation for Statistical Computing
    Platform: x86_64-apple-darwin13.4.0 (64-bit)

[1]: http://bioconductor.org/packages/release/bioc/html/IRanges.html