GenomicRanges: ignore strandedness in the findOverlaps function
Fahim Md
@fahim-md-4018
Hi,

The findOverlaps function depends on the strandedness of the subject and query. In the example below, I am using findOverlaps to find overlapping ranges. Is there a way to ignore strand completely and report all overlaps?

grSubject <- GRanges(
    seqnames = Rle(rep("chr1", 7)),
    ranges = IRanges(c(1,5,15,22,5,15,22), c(11,8,20,30,8,17,25)),
    strand = Rle(strand(c("-", rep("+",6)))),
    name = c('a', 'b','b','b','c','c','c')
)

> grSubject
GRanges with 7 ranges and 1 elementMetadata value:
    seqnames    ranges strand |        name
       <Rle> <IRanges>  <Rle> | <character>
[1]     chr1  [ 1, 11]      - |           a
[2]     chr1  [ 5,  8]      + |           b
[3]     chr1  [15, 20]      + |           b
[4]     chr1  [22, 30]      + |           b
[5]     chr1  [ 5,  8]      + |           c
[6]     chr1  [15, 17]      + |           c
[7]     chr1  [22, 25]      + |           c
---
seqlengths:
 chr1
   NA

grQuery <- GRanges(
    seqnames = Rle(rep("chr1", 3)),
    ranges = IRanges(c(6,14,22), c(10,16,27)),
    strand = Rle(strand(rep("-",3))),
    name = c('x', 'y','z')
)

> grQuery
GRanges with 3 ranges and 1 elementMetadata value:
    seqnames    ranges strand |        name
       <Rle> <IRanges>  <Rle> | <character>
[1]     chr1  [ 6, 10]      - |           x
[2]     chr1  [14, 16]      - |           y
[3]     chr1  [22, 27]      - |           z
---
seqlengths:
 chr1
   NA

> as.matrix(findOverlaps(grQuery, grSubject))
     query subject
[1,]     1       1

When I change the strand of the query like this:

grQuery <- GRanges(
    seqnames = Rle(rep("chr1", 3)),
    ranges = IRanges(c(6,14,22), c(10,16,27)),
    strand = Rle(strand(c("-", rep("+",2)))),
    name = c('x', 'y','z')
)

I get:

> as.matrix(findOverlaps(grQuery, grSubject))
     query subject
[1,]     1       1
[2,]     2       3
[3,]     2       6
[4,]     3       4
[5,]     3       7

Thanks,
Fahim
@steve-lianoglou-2771
Hi Fahim,

On Fri, Jan 6, 2012 at 4:08 PM, Fahim Mohammad wrote:
> Hi
> findOverlap function depends on the 'strand'edness of the subject and
> query. In the example below, I am using findOverlap function to find
> overlapping ranges. Is there a way to completely ignore this feature and
> report all overlaps.

[snip]

Do this in your R workspace:

R> library(GenomicRanges)
R> getMethod('findOverlaps', c("GenomicRanges", "GenomicRanges"))

Notice that the findOverlaps method, when called on GenomicRanges objects, has an `ignore.strand` argument that you can set to TRUE to get what you want. So do:

R> o <- findOverlaps(grQuery, grSubject, ignore.strand=TRUE)

HTH,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
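For readers who want to try this, here is a minimal sketch that rebuilds the two objects from the question and compares the default, strand-aware call with the `ignore.strand=TRUE` call suggested above. The variable names `strandAware` and `strandBlind` are illustrative only, and the sketch assumes a GenomicRanges version in which `queryHits()` and `subjectHits()` are available.

```r
library(GenomicRanges)

# Subject and query from the question (the all-minus-strand query).
grSubject <- GRanges(
  seqnames = rep("chr1", 7),
  ranges   = IRanges(start = c(1, 5, 15, 22, 5, 15, 22),
                     end   = c(11, 8, 20, 30, 8, 17, 25)),
  strand   = c("-", rep("+", 6)),
  name     = c("a", "b", "b", "b", "c", "c", "c")
)
grQuery <- GRanges(
  seqnames = rep("chr1", 3),
  ranges   = IRanges(start = c(6, 14, 22), end = c(10, 16, 27)),
  strand   = rep("-", 3),
  name     = c("x", "y", "z")
)

# Default behaviour: a hit requires compatible strands.
strandAware <- findOverlaps(grQuery, grSubject)

# Strand ignored: every positional overlap on the same sequence is reported.
strandBlind <- findOverlaps(grQuery, grSubject, ignore.strand = TRUE)

length(strandAware)  # 1
length(strandBlind)  # 7

# Translate hit indices back to the 'name' metadata column.
data.frame(
  query   = mcols(grQuery)$name[queryHits(strandBlind)],
  subject = mcols(grSubject)$name[subjectHits(strandBlind)]
)
```

With strand ignored, every subject range is hit at least once, which is the "report all overlaps" behaviour asked for in the question.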
Hi Steve,

I can see that statement in the findOverlaps method. Do I need to write my own function to do this, since 'ignore.strand' is not an allowed argument in the standard function? What are the steps to do this?

Thanks,
Fahim
Thanks Steve. I resolved the problem: I was actually using the ignore.strand option with a RangedData object.

Thanks again,
Fahim
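The resolution above suggests that `ignore.strand` is tied to the method for GRanges-like objects. Where that argument is not convenient, one alternative that is not mentioned in the thread is to reset the strand of a GRanges copy to "*", which overlaps both "+" and "-". The sketch below uses small made-up ranges; `gr1`, `gr2`, and `unstranded` are hypothetical names.

```r
library(GenomicRanges)

# Two small illustrative objects (not the data from the thread).
gr1 <- GRanges(rep("chr1", 2),
               IRanges(start = c(6, 14), end = c(10, 16)),
               strand = c("-", "+"))
gr2 <- GRanges(rep("chr1", 2),
               IRanges(start = c(5, 15), end = c(8, 20)),
               strand = c("+", "+"))

# Strand-aware: only the second query range matches (both are "+").
length(findOverlaps(gr1, gr2))  # 1

# Work on a copy so the original strand information is preserved.
unstranded <- gr1
strand(unstranded) <- "*"

# "*" is compatible with either strand, so both positional overlaps appear.
length(findOverlaps(unstranded, gr2))  # 2
```

Setting the strand on a copy keeps the original annotation intact, whereas `ignore.strand=TRUE` leaves both objects untouched and is usually the simpler choice when it is available.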