Hi Carlos,
IMPORTANT: When you unlist cvg
, you end up with a single Rle object representing the coverage along a single virtual sequence made of the concatenation of all the chromosomes in the genome. So the coordinate system you need to use to specify ranges along that single virtual sequence is not the same as before the unlisting. Before the unlisting, it was a local one i.e. 1 coordinate system per-chromosome. After the unlisting, it becomes a global one i.e. 1 single coordinate system along 1 single virtual sequence.
That means you can't just unlist the ranges in cvg_cds
and use them as-is to define views on unlist(cvg)
. That gives you incorrect view coordinates. Instead you need to do something like:
cds_views <- Views(unlist(cvg), absoluteRanges(unlist(cvg_cds)))
See ?absoluteRanges
for the details.
Now back to "how to extract the read coverage in a genome over a list of ranges"
Using views for this is not as convenient as it could be at the moment but we will try to improve the situation in the near future by implementing containers for genomic views (more details on the bioc-devel mailing list here https://stat.ethz.ch/pipermail/bioc-devel/2016-June/009433.html if you are curious about this).
Anyway, right now this can be done by just subsetting your RleList object (cvg
in your case) with the GRanges object containing your ranges of interest:
cds <- unlist(cvg_cds, use.names=FALSE)
Note that before we can subset cvg
with cds
we need to drop the sequence levels (i.e. chromosomes) from the latter that are not represented in the former:
seqlevels(cds, force=TRUE) <- names(cvg)
Then we can subset:
cvg[cds] # RleList object with 1 list element per range in cds
Last but not least: You might want to check coverageByTranscript()
in GenomicFeatures. It might actually do exactly what you want.
Cheers,
H.