hi all,
i'm interested in building a table of counts from BAM files produced by a paired-end (PE) RNA-seq experiment, using RepeatMasker annotations as features; essentially, i want to know how many RNA-seq reads map to a particular LTR, etc. i've searched for those annotations among the Bioconductor annotation packages but couldn't find anything with the keyword "repeat". searching on the support site led me to the thread "Rsubread::featureCounts extremely slow on some annotations", which discusses the performance of featureCounts for that purpose but gives no details about how the annotations are input to the software. i'm familiar with using 'summarizeOverlaps()' from the GenomicAlignments package to build a table of counts from gene annotations, which i find convenient because the input is a 'TxDb' object and the output is a 'SummarizedExperiment' object. any hint about how to do this in Bioconductor would be much appreciated.
thanks!!
robert.
thanks for the tip. i've seen that section 5.1 of the rtracklayer vignette contains an example of importing RepeatMasker annotations for specific genomic regions. however, because i need to work with the whole set of annotations, i can't use that approach since it would take too long. i've downloaded the rmsk.txt.gz file directly from UCSC, but now the question is whether it can be imported into R/BioC with rtracklayer. i've tried the 'import()' function with different formats and none of them seems to work. before i end up writing my own parser, is there a way to import the rmsk.txt.gz file with rtracklayer? thanx.
Probably could just read it with read.table() and then coerce to GRanges.
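Something along these lines might work as a starting point. It is only a sketch, assuming the file follows the UCSC rmsk table schema (17 tab-separated columns, no header line) and that coordinates use the usual UCSC convention of 0-based starts. The column names are taken from that schema, and 'read_rmsk' is a hypothetical helper name, not a function from any package.

```r
library(GenomicRanges)

# Column names assumed from the UCSC rmsk table schema (the file has no header).
rmsk_cols <- c("bin", "swScore", "milliDiv", "milliDel", "milliIns",
               "genoName", "genoStart", "genoEnd", "genoLeft", "strand",
               "repName", "repClass", "repFamily",
               "repStart", "repEnd", "repLeft", "id")

# Hypothetical helper: read an rmsk.txt.gz file and coerce it to a GRanges.
read_rmsk <- function(path) {
  # gzfile() reads the compressed file directly; quoting and comment
  # handling are disabled so that repeat names are read verbatim.
  df <- read.table(gzfile(path), sep = "\t", header = FALSE,
                   col.names = rmsk_cols, quote = "", comment.char = "",
                   stringsAsFactors = FALSE)
  # UCSC tables store 0-based starts, so ask for conversion to the
  # 1-based coordinates that GRanges uses.
  makeGRangesFromDataFrame(df,
                           keep.extra.columns = TRUE,
                           seqnames.field = "genoName",
                           start.field = "genoStart",
                           end.field = "genoEnd",
                           strand.field = "strand",
                           starts.in.df.are.0based = TRUE)
}
```

With that in place, 'gr <- read_rmsk("rmsk.txt.gz")' should give one range per annotated repeat, with repName, repClass and repFamily kept as metadata columns. The resulting GRanges (or a GRangesList made with 'split(gr, gr$repName)', if counts per repeat name are wanted) can then be passed as the 'features' argument of 'summarizeOverlaps()'.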
True, that was it, thanks!