Janet Young
Fred Hutchinson Cancer Research Center,…
Hi there,
I have a request for findOverlaps (GenomicRanges) - hopefully it's an
easy one.
Is it possible to implement the ignore.strand option for findOverlaps
calls where we're comparing a query GRanges with itself? The reason
I ask is that I'm looking through a set of genes to find pairs that
overlap on opposite strands. Below is some code that should explain it
(I'm using the devel packages).
Thanks very much,
Janet
##### GRanges
library(GenomicRanges)
## an example GRanges object, taken from the findOverlaps-methods
## {GenomicRanges} help page:
gr <-
GRanges(seqnames =
Rle(c("chr1", "chr2", "chr1", "chr3"), c(1, 3, 2, 4)),
ranges =
IRanges(1:10, width = 10:1, names = head(letters,10)),
strand =
Rle(strand(c("-", "+", "*", "+", "-")),
c(1, 2, 2, 3, 2)),
score = 1:10,
GC = seq(1, 0, length=10))
## findOverlaps works, of course, finds 24 hits
findOverlaps(gr)
## ignoreSelf and ignoreRedundant are useful: gives me just 7 useful
## pairs to explore more:
findOverlaps(gr, ignoreSelf=TRUE, ignoreRedundant=TRUE)
## but I'm not getting hits for overlaps on opposite strands - I'd
## like to use ignore.strand, but it only works if we supply both query
## and subject. When I supply gr as both query and subject, I get 34
## pairs ignoring the strand:
findOverlaps(gr, ignore.strand=TRUE)
# Error in .local(query, subject, maxgap, minoverlap, type, select, ...) :
#   unused argument (ignore.strand = TRUE)
findOverlaps(gr, gr, ignore.strand=TRUE)
## but now that I'm supplying the subject, I can't use the other two
## useful options (ignoreSelf and ignoreRedundant) that help me quickly
## get the pairs I'd like to explore more
findOverlaps(gr, gr, ignore.strand=TRUE, ignoreSelf=TRUE,
ignoreRedundant=TRUE)
# Error in .local(query, subject, maxgap, minoverlap, type, select, ...) :
#   unused arguments (ignoreSelf = TRUE, ignoreRedundant = TRUE)
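In the meantime, a possible workaround is to run the two-argument call with ignore.strand=TRUE and then filter the hits by index, which should mimic ignoreSelf and ignoreRedundant. This is only a sketch: the variable names (hits, pairs, opposite) are my own, and keeping pairs where the query index is smaller than the subject index is my assumption about what the two options do, not necessarily how the package implements them.

```r
library(GenomicRanges)

## same example object as above (metadata columns omitted for brevity)
gr <- GRanges(
  seqnames = Rle(c("chr1", "chr2", "chr1", "chr3"), c(1, 3, 2, 4)),
  ranges = IRanges(1:10, width = 10:1, names = head(letters, 10)),
  strand = Rle(strand(c("-", "+", "*", "+", "-")), c(1, 2, 2, 3, 2))
)

## all overlaps of gr against itself, ignoring strand
hits <- findOverlaps(gr, gr, ignore.strand = TRUE)
q <- queryHits(hits)
s <- subjectHits(hits)

## q < s drops self hits (q == s) and keeps one copy of each (a, b)/(b, a) pair
keep <- q < s
pairs <- data.frame(query = q[keep], subject = s[keep])

## keep only pairs on opposite strands; "*" ranges are excluded here,
## which is a choice made for this sketch
st <- as.character(strand(gr))
sq <- st[pairs$query]
ss <- st[pairs$subject]
opposite <- pairs[sq != ss & sq != "*" & ss != "*", ]
opposite
```

Filtering after the fact means findOverlaps still computes every hit, so a built-in ignore.strand for the self-overlap case would still be nicer for large gene sets.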
sessionInfo()
R version 3.1.0 Patched (2014-05-26 r65771)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] GenomicRanges_1.17.17 GenomeInfoDb_1.1.6 IRanges_1.99.15
[4] S4Vectors_0.0.8 BiocGenerics_0.11.2
loaded via a namespace (and not attached):
[1] stats4_3.1.0 XVector_0.5.6