Entering edit mode
I have two Granges objects that overlap in 90% of positions. I am interested in the positions that do not overlap and are only present in one Granges object, but not the other. How to extract those positions?
I have two Granges objects that overlap in 90% of positions. I am interested in the positions that do not overlap and are only present in one Granges object, but not the other. How to extract those positions?
To do this, one way is via negative indices, as I elaborate in my answer on my Biostars: https://www.biostars.org/p/263214/#337057
So, a reproducible example:
require(GenomicRanges)
gr1 <- makeGRangesFromDataFrame(
data.frame(
chr = rep('chr1',3),
start = c(1,100,500),
end = c(50,200,1000),
gene = c('ENSG1','ENSG2','ENSG3')),
keep.extra.columns = TRUE)
gr2 <- makeGRangesFromDataFrame(
data.frame(
chr=rep('chr1',3),
start=c(100,500,1001),
end=c(200,1000,1050),
gene=c("ENSG2","ENSG3","ENSG4")),
keep.extra.columns=TRUE)
gr1
GRanges object with 3 ranges and 1 metadata column:
seqnames ranges strand | gene
<Rle> <IRanges> <Rle> | <factor>
[1] chr1 1-50 * | ENSG1
[2] chr1 100-200 * | ENSG2
[3] chr1 500-1000 * | ENSG3
gr2
GRanges object with 3 ranges and 1 metadata column:
seqnames ranges strand | gene
<Rle> <IRanges> <Rle> | <factor>
[1] chr1 100-200 * | ENSG2
[2] chr1 500-1000 * | ENSG3
[3] chr1 1001-1050 * | ENSG4
gr1 / gr2
idx_overlaps <- queryHits(findOverlaps(gr1, gr2, type = 'any'))
gr1[-idx_overlaps,]
GRanges object with 1 range and 1 metadata column:
seqnames ranges strand | gene
<Rle> <IRanges> <Rle> | <factor>
[1] chr1 1-50 * | ENSG1
gr2 / gr1
)gr2[-queryHits(findOverlaps(gr2, gr1, type = 'any')),]
GRanges object with 1 range and 1 metadata column:
seqnames ranges strand | gene
<Rle> <IRanges> <Rle> | <factor>
[1] chr1 1001-1050 * | ENSG4
Kevin
Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Another option is
subsetByOverlaps(gr1, gr2, invert=TRUE)
. Also,setdiff(gr1, gr2)
will return the positions covered in one but not the other (position-based instead of element-based).