Hi
I used the disjointExons function (from the GenomicFeatures package) to extract the non-overlapping exon parts from a TxDb object
disjoint_exons<-disjointExons(txdb,includeTranscripts=TRUE)
resulting in the following Granges object (yellow shaded)
GRanges object with 65726 ranges and 3 metadata columns:
seqnames ranges strand | gene_id tx_name
<Rle> <IRanges> <Rle> | <CharacterList> <CharacterList>
[1] scaffold_1 [ 6883, 7179] + | tetur01g00010 tetur01g00010.1
[2] scaffold_1 [ 7271, 7355] + | tetur01g00010 tetur01g00010.1
[3] scaffold_1 [18549, 18719] + | tetur01g00010 tetur01g00010.1
[4] scaffold_1 [21717, 22421] + | tetur01g00010 tetur01g00010.1
[5] scaffold_1 [23525, 24190] + | tetur01g00030 tetur01g00030.1
... ... ... ... ... ... ...
[65722] scaffold_99 [ 8664, 8967] - | tetur99g00020 tetur99g00020.1
exonic_part
<integer>
[1] 1
[2] 2
[3] 3
[4] 4
[5] 1
... ...
[65722] 2
Based on this Granges object I would like to calculate the total transcript length for each gene_id in elementMetadata (teturxxgxxxx)?
I tried to convert the Granges object to a txdb object (using makeTxDbFromGRanges) in order to use transcriptLength but that gave an error (Error in .get_type(gr_mcols) : 'gr' must have a "type" metadata column)
Any help would be much appreciated!
Thank you Michael for the quick reply.
However, I would like to have the exonic length of every gene, but excluding overlap of exons with neighbouring genes.
Wannes
Ok, then you just need to split the widths by gene (after dropping to a vector) and sum them:
That is probably the fastest way, but there are others, like this:
Also these now work in devel:
Thank you, that worked!