Hi,
I have a set of genomic intervals stored as GRanges objects, and I split each GRanges object into two distinct sets based on the score column, so that each object has a Confirm set and a Discard set after filtering. My approach works, but its output format is awkward and needs to be simplified. There must be a more efficient way to produce a cleaner output when filtering large GRanges objects. Can anyone point me to an easy solution? Thanks a lot!
Note: the toy data below is simulated to mirror the structure of my real dataset.
# toy data
library(GenomicRanges)
grs <- GRangesList(
  foo = GRanges(seqnames = Rle("chr1", 3),
                ranges = IRanges(c(2, 7, 16), c(5, 14, 20)),
                rangeName = c("a1", "a2", "a3"), score = c(4, 6, 9)),
  bar = GRanges(seqnames = Rle("chr1", 3),
                ranges = IRanges(c(4, 13, 26), c(11, 17, 28)),
                rangeName = c("b1", "b2", "b3"), score = c(11, 7, 8)),
  bleh = GRanges(seqnames = Rle("chr1", 4),
                 ranges = IRanges(c(1, 4, 10, 23), c(3, 8, 14, 29)),
                 rangeName = c("c1", "c2", "c3", "c4"), score = c(4, 6, 3, 8))
)
So I came up with the following, but the result is in an awkward form, and I am stuck trying to simplify it:
res <- lapply(grs, function(x) split(x, c("Confirm", "Discard")[(x$score > 6)+1]))
I want to simplify it for the following reason:
> res[1]
I want to compare res[[1]]$Confirm with res[[1]]$Discard. For example, if a region exists in both the Confirm and the Discard set, I want to remove it. I think it would be better to flatten the nested list first, so that each element becomes a separate list whose subsets I can access directly.
How can I perform this simplification? Does anyone know a trick for this kind of manipulation?
Best regards,
Jurat
Does calling unlist within lapply() return what you want?
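One possible reading of this suggestion, sketched on the toy data from the question: call `unlist()` on the result of `split()` inside the `lapply()`, so each element comes back as a single GRanges object rather than a nested GRangesList. The Confirm/Discard labels then survive as the names of the ranges. This is only an illustration of the idea and not necessarily the exact code the reply had in mind.

```r
library(GenomicRanges)

# Toy data from the question
grs <- GRangesList(
  foo = GRanges("chr1", IRanges(c(2, 7, 16), c(5, 14, 20)),
                rangeName = c("a1", "a2", "a3"), score = c(4, 6, 9)),
  bar = GRanges("chr1", IRanges(c(4, 13, 26), c(11, 17, 28)),
                rangeName = c("b1", "b2", "b3"), score = c(11, 7, 8)),
  bleh = GRanges("chr1", IRanges(c(1, 4, 10, 23), c(3, 8, 14, 29)),
                 rangeName = c("c1", "c2", "c3", "c4"), score = c(4, 6, 3, 8))
)

# Same split as in the question (score > 6 is labelled "Discard"),
# followed by unlist() so that each result is one flat GRanges object
res <- lapply(grs, function(x) {
  parts <- split(x, c("Confirm", "Discard")[(x$score > 6) + 1])
  unlist(parts)  # range names become "Confirm" / "Discard"
})

# The group label of each range is now available through names()
res$foo[names(res$foo) == "Confirm"]
```

The result is a plain list of GRanges objects, so `res$foo` can be subset by name directly instead of going through `res[[1]]$Confirm`.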
Hi Diego:
Thanks for your response to my post. Yes, your answer is a good step and quite close to the output format I expected. I think it would be cleaner to add Confirm/Discard as a new metadata column next to the score column. How can I do this? Also, if I extract the Confirmed and Discarded regions from each GRanges object and compare them with each other, how can I do that? Thanks for your kind help!
Best regards,
Jurat
Below is an attempt to do what I understood you want. For some reason I can't reassign a `mcols` object in `GRanges` with `<-`, making the code more complicated than I was hoping. I use `cut` to classify scores based on whether they are between 0 and 6 or larger. You can tune those values to your liking. This is highly likely not the best way to do it. Hopefully someone else will point out a better option:
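A minimal sketch of the `cut()` idea described above, using the toy data from the question. It is not the original code from this reply. To avoid reassigning metadata columns element by element, it flattens the GRangesList with `unlist()`, adds the column once, and restores the list structure with `relist()`. The column name `status` and the overlap-based comparison at the end are assumptions made for illustration, since the thread does not say precisely how Confirm and Discard regions should be compared.

```r
library(GenomicRanges)

# Toy data from the question
grs <- GRangesList(
  foo = GRanges("chr1", IRanges(c(2, 7, 16), c(5, 14, 20)),
                rangeName = c("a1", "a2", "a3"), score = c(4, 6, 9)),
  bar = GRanges("chr1", IRanges(c(4, 13, 26), c(11, 17, 28)),
                rangeName = c("b1", "b2", "b3"), score = c(11, 7, 8)),
  bleh = GRanges("chr1", IRanges(c(1, 4, 10, 23), c(3, 8, 14, 29)),
                 rangeName = c("c1", "c2", "c3", "c4"), score = c(4, 6, 3, 8))
)

# Flatten to one GRanges so the new column is added in a single step
flat <- unlist(grs, use.names = FALSE)

# Scores in (0, 6] are "Confirm", scores above 6 are "Discard";
# adjust the breaks to taste. "status" is a hypothetical column name.
flat$status <- as.character(
  cut(flat$score, breaks = c(0, 6, Inf), labels = c("Confirm", "Discard"))
)

# Restore the original list structure (foo, bar, bleh)
annotated <- relist(flat, grs)

# One assumed way to compare the two sets within each element:
# keep Confirm ranges that do not overlap any Discard range
keep_confirmed <- lapply(annotated, function(x) {
  confirmed <- x[x$status == "Confirm"]
  discarded <- x[x$status == "Discard"]
  confirmed[!(confirmed %over% discarded)]
})
```

Keeping the classification as a metadata column means no nested list is needed at all: any element of `annotated` can be filtered on `status` when required.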
Dear Diego,
Thanks again for your effort on this question. Your approach at least gives me hope of getting past this problem. Thank you so much!
Jurat