coverage() in IRanges gives >0 coverage when weight == 0
1
0
Entering edit mode
Ludo Pagie ▴ 40
@ludo-pagie-6130
Last seen 8.8 years ago
Hi all, I don't understand what's going on here.I'm trying to compute the coverage of a few ranges. I specify a weight for these ranges and for some positions I get a coverage even though I know these positions are only covered by ranges which have 0 weight. Worse, the value I do get for these positions depends on the number of ranges I feed into the function. Here's some code using extracts of the data I'm working on: ################################################################ ################################################################ # funny way to make the same RangedData object I have ranges <- as(structure(list(space = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L ), .Label = "CH321-10J13_YELLOW", class = "factor"), start = c(13449753L, 13449972L, 13449972L, 13450037L, 13450038L, 13450039L, 13450039L, 13450042L, 13450169L, 13450193L, 13450198L, 13450253L, 13450284L, 13450296L, 13450308L, 13450308L, 13450522L, 13450690L, 13450690L, 13450870L, 13450997L, 13451104L, 13451133L, 13451133L, 13451142L, 13451147L, 13451215L, 13451216L, 13451513L, 13451788L, 13451812L, 13451812L, 13451902L, 13451961L, 13451961L, 13452337L, 13452955L, 13453049L, 13453053L), end = c(13451753L, 13451573L, 13451573L, 13451908L, 13451921L, 13451700L, 13451700L, 13451675L, 13452208L, 13451895L, 13451971L, 13451731L, 13451553L, 13452172L, 13451502L, 13451502L, 13451806L, 13452630L, 13452630L, 13452711L, 13452527L, 13452571L, 13451378L, 13451378L, 13452533L, 13453030L, 13452281L, 13452439L, 13453079L, 13453360L, 13452603L, 13452603L, 13453506L, 13452789L, 13452789L, 13453123L, 13453484L, 13453773L, 13453387L ), width = c(2001L, 1602L, 1602L, 1872L, 1884L, 1662L, 1662L, 1634L, 2040L, 1703L, 1774L, 1479L, 1270L, 1877L, 1195L, 1195L, 1285L, 1941L, 1941L, 1842L, 1531L, 1468L, 246L, 246L, 1392L, 1884L, 1067L, 1224L, 1567L, 1573L, 792L, 792L, 1605L, 829L, 829L, 787L, 530L, 725L, 335L), count = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)), .Names = c("space", "start", "end", "width", "count"), row.names = c(NA, -39L), class = "data.frame"),'RangedData') # make the weight vector count <- c(3.790678721025, 3.790678721025, 0, 0, 0, 1.263559573675, 2.52711914735, 6.317797868375, 159.20850628305, 0, 156.6813871357, 1.263559573675, 162.999185004075, 2.52711914735, 1139.73073545485, 0, 0, 0, 304.517857255675, 5.0542382947, 1.263559573675, 3.790678721025, 1.263559573675, 0, 6.317797868375, 32.85254891555, 0, 451.090767801975, 0, 17.68983403145, 24.007631899825, 0, 46.751704225975, 0, 0, 0, 0, 0, 0) # my shift argument shift <- structure(list(`CH321-10J13_YELLOW` = -13421338), .Names = "CH321-10J13_YELLOW") # I know base position 13453507 is covered by a single range, which has weight 0 mypos <- 13453507 # which ranges overlap with mypos: which(start(ranges) <= mypos & end(ranges) >= mypos) # 38 # this range has weight 0 count[38] # 0 # compute coverage and report the coverage at mypos: coverage(ranges, weight=list(count))[[1]][mypos] #numeric-Rle of length 1 with 1 run # Lengths: 1 # Values : -5.61328761250479e-13 # if I reduce the number of ranges going into the computation I get different results coverage(ranges[20:39,], weight=list(count[20:39]))[[1]][mypos] #numeric-Rle of length 1 with 1 run # Lengths: 1 # Values : 7.105427357601e-15 coverage(ranges[35:39,], weight=list(count[35:39]))[[1]][mypos] #numeric-Rle of length 1 with 1 run # Lengths: 1 # Values : 0 # sessionInfo: R version 3.0.1 (2013-05-16) Platform: x86_64-pc-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=C LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] parallel stats graphics grDevices utils datasets methods [8] base other attached packages: [1] IRanges_1.18.3 BiocGenerics_0.6.0 loaded via a namespace (and not attached): [1] stats4_3.0.1 tools_3.0.1 ################################################################ ################################################################ I get the same issue in another R-version: R version 2.15.1 (2012-06-22) Platform: x86_64-pc-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets [6] methods base other attached packages: [1] IRanges_1.16.6 BiocGenerics_0.4.0 [3] BiocInstaller_1.8.3 gdata_2.13.2 [5] gtools_3.0.0 loaded via a namespace (and not attached): [1] KernSmooth_2.23-10 parallel_2.15.1 stats4_2.15.1 [4] tools_2.15.1 Is this the result of precision issues? Or is it a bug? Or am I overlooking something else? Thanks for your help, Ludo
Coverage Coverage • 941 views
ADD COMMENT
0
Entering edit mode
Charles Berry ▴ 290
@charles-berry-5754
Last seen 5.7 years ago
United States
Ludo Pagie <ludo.pagie at="" ...=""> writes: > > Hi all, > > I don't understand what's going on here.I'm trying to compute the > coverage of a few ranges. I specify a weight for these ranges and for > some positions I get a coverage even though I know these positions are > only covered by ranges which have 0 weight. > Worse, the value I do get for these positions depends on the number of > ranges I feed into the function. Here's some code using extracts of > the data I'm working on: > See R FAQ 7.31 Why doesn't R think these numbers are equal? coverage() uses floating point arithmetic to get the value and is subject to round-off error. The amount of round-off error depends on the particulars, so you will get different values for (a+b) - a depending on what b is > (1e5+1e-10)-1e5 [1] 1.018634e-10 > (1e6+1e-10)-1e6 [1] 1.164153e-10 >
ADD COMMENT

Login before adding your answer.

Traffic: 858 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6