Here is a query and a subject:
query = GRanges(c("1:5-10", "1:7-12"))
subject = GRanges(c("1:6-6", "1:10-10"), score=10 * 1:2)
My implementation of your 'slow and cumbersome' is
hits = as(findOverlaps(query, subject), "List")
weightedCount = sum(extractList(subject$score, hits))
which I guess is a bit cumbersome, but I think quite fast. In detail: use findOverlaps() and coerce the return value to an IntegerList with one element for each query
> hits = as(findOverlaps(query, subject), "List")
> hits
IntegerList of length 2
[[1]] 1 2
[[2]] 2
Use extractList() to reformat the score column in a way that parallels the hits
> extractList(subject$score, hits)
NumericList of length 2
[[1]] 10 20
[[2]] 20
and finally use sum() to accumulate the score for each query element
> sum(extractList(subject$score, hits))
[1] 30 20
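As a cross-check on the steps above, here is a minimal sketch that computes the same per-query sums with base R grouping on queryHits() and subjectHits() instead of the List coercion. The names byQuery, scores and weightedCount2 are made up for illustration, and this is only one possible alternative, not necessarily a faster one.

```r
library(GenomicRanges)

query <- GRanges(c("1:5-10", "1:7-12"))
subject <- GRanges(c("1:6-6", "1:10-10"), score = 10 * 1:2)

hits <- findOverlaps(query, subject)

# One group per query; the explicit factor levels keep queries with no hits,
# which then sum to 0.
byQuery <- factor(queryHits(hits), levels = seq_along(query))

# Scores of the overlapping subject ranges, grouped by query index.
scores <- split(subject$score[subjectHits(hits)], byQuery)

# Per-query sum of scores (illustrative name, not part of any package).
weightedCount2 <- unname(vapply(scores, sum, numeric(1)))
```

With the example data this should agree with sum(extractList(subject$score, hits)).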
The Hits to List coercion is not an obvious one, and sometimes I wonder whether we should have
extractByQueryHits(subject, hits)
or something.

Thanks! I was unaware that you could coerce a Hits object to a List. I was changing everything to a data.table, but your version is slightly faster and much more elegant code-wise.