Dear Bioconductors,
This is a bit of a curiosity question. I have been working with the TxDb.Hsapiens.UCSC.hg19.knownGene package and noticed that there are some exons that do not seem to be part of any gene.
> # get all the genes
> genic.regions <- genes(TxDb.Hsapiens.UCSC.hg19.knownGene)
> # get all the exons
> exonic.regions <- exons(TxDb.Hsapiens.UCSC.hg19.knownGene)
> # Find the overlaps between the genes and exons
> findOverlaps(genic.regions, exonic.regions)
Hits object with 270213 hits and 0 metadata columns:
           queryHits subjectHits
           <integer>   <integer>
       [1]         1      250809
       [2]         1      250810
       [3]         1      250811
       [4]         1      250812
       [5]         1      250813
       ...       ...         ...
  [270209]     23056      266961
  [270210]     23056      266962
  [270211]     23056      266963
  [270212]     23056      266964
  [270213]     23056      266965
  -------
  queryLength: 23056
  subjectLength: 289969
As you can see, there are nearly 290000 exons, but findOverlaps reports only about 270000 hits against the genes, so at least some 20000 exons do not overlap any gene. I can see this very clearly when I plot the genes and exons overlapping a fragment of a chromosome. There are a few exons (marked by the green triangle) that do not appear to be part of any gene. So my question is: what might they be, and how should I deal with them if, for instance, I'm trying to get the coordinates of the intronic or intergenic regions?
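For reference, here is a minimal sketch of how the exons without any overlapping gene could be pulled out and counted directly, rather than inferred from the number of hits. It uses made-up toy ranges (toy.genes and toy.exons are hypothetical stand-ins for genic.regions and exonic.regions) so that it runs with only GenomicRanges loaded; countOverlaps is assumed to be strand-aware here, as it is by default for GRanges.

```r
library(GenomicRanges)

# Toy stand-ins for genes(txdb) and exons(txdb); all coordinates are invented.
toy.genes <- GRanges("chr1",
                     IRanges(start = c(100, 150, 500), end = c(200, 250, 600)),
                     strand = "+")
toy.exons <- GRanges("chr1",
                     IRanges(start = c(100, 150, 300, 550),
                             end   = c(120, 200, 350, 600)),
                     strand = "+")

# For each exon, count how many genes it overlaps.
n.genes <- countOverlaps(toy.exons, toy.genes)

# Exons that overlap no gene at all.
orphan.exons <- toy.exons[n.genes == 0]

# Number of distinct exons that overlap at least one gene. This can be
# smaller than the number of hits, because an exon that overlaps two
# genes contributes two hits.
n.exons.in.genes <- sum(n.genes > 0)
n.hits <- length(findOverlaps(toy.genes, toy.exons))
```

On the real data the same lines should apply with genic.regions and exonic.regions substituted for the toy objects; length(orphan.exons) then gives the exact number of exons outside every gene.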
I don't think your images are showing, if you have any.
Thanks, fixed it.