On Fri, Jan 23, 2009 at 10:24 PM, Elizabeth Purdom <epurdom@stat.berkeley.edu> wrote:
> Hi,
>
> I am trying to take overlapping intervals and return a set of intervals
> that are not overlapping but cover all of the region (and maintain the
> intervals that don't overlap). In particular, I don't want to merge
> intervals that overlap together (i.e. the reduce function in IRanges); I
> want to cut them up into distinct regions. For example, if I have intervals:
> [1,6], [4,8], [7,10]
> I want to get back the set of adjacent intervals:
> [1,3],[4,6],[7,8],[9,10]
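Well, that's a fun one.

ir <- IRanges(c(1, 4, 7), c(6, 8, 10))
# new starts: every original start, plus the position after each end but the last
# new ends: every original end, plus the position before each start but the first
adj <- IRanges(sort(unique(c(start(ir), head(end(ir), -1) + 1))),
               sort(unique(c(end(ir), tail(start(ir), -1) - 1))))

... is a not-so-nice one, but pretty fast.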
But if you had a gap in those ranges, like:

ir <- IRanges(c(1, 4, 10), c(6, 8, 10))

so that there's a gap at position 9, adj (recomputed as above) would also
contain the uncovered range [9,9], and you would need an additional
filtering step:

adj[adj %in% ir]

This last step requires the devel version of IRanges, but can be emulated
using !is.na(overlap(ir, adj, multiple=FALSE)).
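
The same breakpoint idea can be sketched in base R, without IRanges. This is only an illustration under stated assumptions: disjoin_intervals is a hypothetical helper name (not an IRanges function), the intervals are closed integer ranges passed as plain start/end vectors, and the input is non-empty. It drops uncovered pieces itself, so the gap case needs no separate filtering step.

```r
# Hypothetical helper (not part of IRanges): cut closed integer intervals
# into non-overlapping pieces that cover exactly the same positions.
disjoin_intervals <- function(start, end) {
  # A new piece begins at every start and at the position after every end.
  breaks <- sort(unique(c(start, end + 1)))
  piece_start <- head(breaks, -1)
  piece_end <- tail(breaks, -1) - 1
  # Keep a piece only if at least one input interval contains it entirely;
  # pieces that fall in a gap between intervals are dropped.
  covered <- vapply(seq_along(piece_start), function(i) {
    any(start <= piece_start[i] & end >= piece_end[i])
  }, logical(1))
  data.frame(start = piece_start[covered], end = piece_end[covered])
}

# The example from the question: [1,6], [4,8], [7,10]
res <- disjoin_intervals(c(1, 4, 7), c(6, 8, 10))
# The variant with a gap at position 9: [1,6], [4,8], [10,10]
res_gap <- disjoin_intervals(c(1, 4, 10), c(6, 8, 10))
```

The covered check tests each piece against every input interval, so it is quadratic in the worst case; for large inputs the vectorised IRanges route above is likely the better choice.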
> The options I find that look like they perhaps do this (intersect or
> setdiff?) seem to be related to the 'normal' ranges class; but this class
> requires a gap between intervals -- no adjacent intervals -- which is not
> what I want. Is there a nice way to do this with IRanges (or a not so nice
> one, but fast)?
>
The intersect and setdiff functions are for any Ranges, normal or not. They
return normal IRanges though. Perhaps the documentation does not make this
clear. They probably aren't very useful functions.
>
> Similarly, is there a 'reduce' version that doesn't merge adjacent
> intervals but only truly overlapping ones? There are a lot of annotation
> examples where you would not want to merge adjacent intervals (e.g. UTRs)
>
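Try a trick like this:

ir2 <- IRanges(c(1, 5, 7), c(4, 6, 9))
width(ir2) <- width(ir2) - 1    # shrink each range by one position
rir2 <- reduce(ir2)
width(rir2) <- width(rir2) + 1  # restore the original extent

Shrinking every range first means that ranges which were merely adjacent no
longer touch, so reduce merges only those that truly overlapped.

Or find the overlap, reduce those that did overlap and combine that result
with those that did not overlap.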
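
For comparison, the "merge only true overlaps" behaviour can also be sketched in base R. Again this is an illustration, not IRanges functionality: reduce_overlapping is a hypothetical name, and the sketch assumes non-empty, closed integer intervals given as start/end vectors. Two intervals are merged only when they share at least one position; intervals that merely abut are kept apart.

```r
# Hypothetical helper (not part of IRanges): merge intervals that share at
# least one position, leaving merely adjacent intervals separate.
reduce_overlapping <- function(start, end) {
  ord <- order(start, end)
  s <- start[ord]
  e <- end[ord]
  # Largest end seen so far, in start order.
  run_end <- cummax(e)
  # A new group opens when a start lies strictly beyond every earlier end.
  new_group <- c(TRUE, s[-1] > head(run_end, -1))
  grp <- cumsum(new_group)
  data.frame(start = as.vector(tapply(s, grp, min)),
             end = as.vector(tapply(e, grp, max)))
}

# Adjacent only: [1,4], [5,6], [7,9] are all kept as they are.
kept <- reduce_overlapping(c(1, 5, 7), c(4, 6, 9))
# [1,6] and [4,8] overlap and merge; [9,10] is only adjacent and stays apart.
merged <- reduce_overlapping(c(1, 4, 9), c(6, 8, 10))
```

Changing the comparison to s[-1] > head(run_end, -1) + 1 would also merge adjacent intervals, which is the behaviour the question wants to avoid.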
> Thanks for any assistance!
Thanks for providing more use cases. We'll consider adding functionality
along these lines to the base package (actually the reduce one has been on
the TODO list for many months).
>
> Elizabeth Purdom
> Division of Biostatistics
> UC, Berkeley