Introduction and problem
I have multiple (>2) GRanges objects. I want to find the ranges that are shared by x% or more of these GRanges objects.
Example data
Here is some example data as data frames. Let's say we want to find the ranges that are shared by 66.7% (2/3) or more of the objects.
gr1 <- data.frame(seqnames = rep('chr1', 3),
                  start = c(1, 10, 20),
                  end = c(3, 17, 30))
gr2 <- data.frame(seqnames = rep('chr1', 3),
                  start = c(2, 11, 31),
                  end = c(3, 19, 35))
gr3 <- data.frame(seqnames = rep('chr1', 3),
                  start = c(2, 16, 37),
                  end = c(3, 22, 40))
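Since the example is given as data frames but the question is about GRanges, here is a minimal sketch of the conversion. It assumes the GenomicRanges package is installed and relies on `makeGRangesFromDataFrame()` picking up the `seqnames`, `start` and `end` columns; the list name `grs` is only illustrative.

```r
library(GenomicRanges)

# Example data frames from the question
gr1 <- data.frame(seqnames = rep('chr1', 3),
                  start = c(1, 10, 20),
                  end = c(3, 17, 30))
gr2 <- data.frame(seqnames = rep('chr1', 3),
                  start = c(2, 11, 31),
                  end = c(3, 19, 35))
gr3 <- data.frame(seqnames = rep('chr1', 3),
                  start = c(2, 16, 37),
                  end = c(3, 22, 40))

# Convert each data frame to a GRanges object and keep them in a named list
# ('grs' is a hypothetical name used for illustration)
grs <- lapply(list(gr1 = gr1, gr2 = gr2, gr3 = gr3),
              makeGRangesFromDataFrame)
```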
Output wanted
A GRanges object. In the example, the algorithm should find:
chr1 2 - 3 Reason: 2-3 is found in gr1, gr2 and gr3; 1 is found only in gr1.
chr1 11 - 22 Reason: 11-17 is found in gr1 and gr2; 10 is found only in gr1; 18-19 is found in gr2 and gr3; 20-22 is found in gr1 and gr3.
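The reasoning above amounts to counting, for each position, how many objects cover it, and keeping the stretches where that count reaches the threshold. One possible sketch of this idea (not necessarily the only or best approach) uses `coverage()` and `slice()`. It assumes the GenomicRanges package, ignores strand, and treats each object as contributing at most once per position; the names `min_frac`, `min_n` and `shared` are illustrative.

```r
library(GenomicRanges)

# Example data from the question, built directly as GRanges
grs <- list(
  GRanges("chr1", IRanges(start = c(1, 10, 20), end = c(3, 17, 30))),
  GRanges("chr1", IRanges(start = c(2, 11, 31), end = c(3, 19, 35))),
  GRanges("chr1", IRanges(start = c(2, 16, 37), end = c(3, 22, 40)))
)

min_frac <- 2 / 3  # required fraction of objects (assumed parameter)
# Smallest number of objects meeting the fraction; the small tolerance
# guards against floating-point error before rounding up
min_n <- ceiling(min_frac * length(grs) - 1e-8)

# reduce() merges overlaps within an object, so each object adds
# at most 1 to the coverage at any position
pooled <- do.call(c, unname(lapply(grs, reduce, ignore.strand = TRUE)))

# Per-position count of objects covering each base
cov <- coverage(pooled)

# Keep contiguous stretches where the count is at least min_n
shared <- as(slice(cov, lower = min_n, rangesOnly = TRUE), "GRanges")
shared
```

On the example data this should yield chr1 2-3 and chr1 11-22, matching the output described above. Pooling the reduced objects before a single `coverage()` call avoids having to add per-object coverage vectors of different lengths.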
What I have done
I know how to find the ranges that are shared by all (100%) of the GRanges objects; see https://stackoverflow.com/questions/23331475/r-overlap-multiple-granges-with-findoverlaps