Entering edit mode
RichardJActon
▴
50
@richardjacton-12268
Last seen 4.3 years ago
I have a GRanges object and I would like to get a list of GRange objects which are subsets of the original object, each of the the objects in this list should have the same position (same chr, start and end values).
library(GenomicRanges) # input: gr=GRanges(seqnames=c("chr1","chr1","chr2","chr2"), ranges=IRanges(start=c(50,50,200,200),end=c(100,100,300,300)), strand=c("+","+","-","-"), scores=c(0.5,0.5,0.5,0.5) ) gr >GRanges object with 4 ranges and 1 metadata column: > seqnames ranges strand | scores > | > [1] chr1 [ 50, 100] + | 0.5 > [2] chr1 [ 50, 100] + | 0.5 > [3] chr2 [200, 300] - | 0.5 > [4] chr2 [200, 300] - | 0.5 > ------- > seqinfo: 2 sequences from an unspecified genome; no seqlengths # the desired output is an object of the form of ls gr1=GRanges(seqnames=c("chr1","chr1"), ranges=IRanges(start=c(50,50),end=c(100,100)), strand=c("+","+") ) gr1 >GRanges object with 2 ranges and 0 metadata columns: > seqnames ranges strand > > [1] chr1 [50, 100] + > [2] chr1 [50, 100] + > ------- > seqinfo: 1 sequence from an unspecified genome; no seqlengths gr2=GRanges(seqnames=c("chr2","chr2"), ranges=IRanges(start=c(200,200),end=c(300,300)), strand=c("-","-"), scores=c(0.5,0.5) ) gr2 >GRanges object with 2 ranges and 1 metadata column: > seqnames ranges strand | scores > | > [1] chr2 [200, 300] - | 0.5 > [2] chr2 [200, 300] - | 0.5 > ------- > seqinfo: 1 sequence from an unspecified genome; no seqlengths ls <- list(gr1,gr2) ls >[[1]] >GRanges object with 2 ranges and 0 metadata columns: > seqnames ranges strand > > [1] chr1 [50, 100] + > [2] chr1 [50, 100] + > ------- > seqinfo: 1 sequence from an unspecified genome; no seqlengths > >[[2]] >GRanges object with 2 ranges and 1 metadata column: > seqnames ranges strand | scores > | > [1] chr2 [200, 300] - | 0.5 > [2] chr2 [200, 300] - | 0.5 > ------- > seqinfo: 1 sequence from an unspecified genome; no seqlengths
For example in the excerpt from the GRanges object above the first 2 lines should be in a new GRanges object in a list of GRange objects and the last 2 lines in a second GRanges object in that list.
Any suggestions?
Yea I noticed the warning and I'm working on it. In theory, we could convert GRanges to a grouping more efficiently than the character coercion, using the 4-way integer hash.
Thanks Mike,
That seems to do exactly what I wanted.
I'm not sure I understand how though.
looking at the output of `as.factor(gr)` I can see why it works with as.factor but not why it would work without?
I had not expected that behaviour from as.factor as applied to a GRanges object, thought the presence of metadata might cause issues but it seems to be working on the IRanges bit, which makes some sense in retrospect. This is a very useful behaviour to be aware of.