Let us say I have two GenomicRanges objects that look like this:
library(GenomicRanges) gr1 <- GRanges("chr1", IRanges(c(1, 11, 21), c(9, 19, 29), names=paste0("rsid:", letters[1:3])), score=c(6,7,8)) gr2 <- GRanges("chr1", IRanges(c(1,11,31), c(9,19,39), names=paste0("dnase:", letters[1:3])), score=c(0,1,5))
I want to take the union of bins, keeping the metadata, so that I end up with:
GRanges object with 4 ranges and 2 metadata columns: seqnames ranges strand | score score <Rle> <IRanges> <Rle> | <numeric> <numeric> dnase:a chr1 [ 1, 9] * | 0 6 dnase:b chr1 [11, 19] * | 1 7 rsid:c chr1 [21, 29] * | 0 8 dnase:c chr1 [31, 39] * | 5 0
Ps. the final names of the sequences do not matter, ideally they should have none.
Ps2. I have more than two GenomicRanges, so great if you can solve this by calling reduce.
Edit: Currently I am here:
u = union(gr1, gr2) values(u)$gr1 = values(gr1)
But this leads to
Error in `[[<-`(`*tmp*`, name, value = <S4 object of class "DataFrame">) : 3 elements in value to replace 4 elements
With the method I am attempting to use I need some way to only set values on a subset of u.
This does not work for some reason:
elementMetadata(u[start(u) == start(gr1)])$gr1 = elementMetadata(gr1)[,1]
If you do
elementMetadata(u)$gr1 = 0
it works
However, when I try this for gr2 it does not work:
elementMetadata(u)$gr2 = 0 elementMetadata(u[start(u) == start(gr2)])$gr2 = elementMetadata(gr2)[,1]
The result is
GRanges object with 4 ranges and 2 metadata columns: seqnames ranges strand | gr1 gr2 <Rle> <IRanges> <Rle> | <numeric> <numeric> [1] chr1 [ 1, 9] * | 6 0 [2] chr1 [11, 19] * | 7 0 [3] chr1 [21, 29] * | 8 0 [4] chr1 [31, 39] * | 0 0
(both from bioconda)