calculate total transcript length from DisjointExons output (Genomic Ranges Object)
1
0
Entering edit mode
wd ▴ 30
@wd-7410
Last seen 4.7 years ago
Germany

Hi

I used the disjointExons function (from the GenomicFeatures package) to extract the non-overlapping exon parts from a TxDb object

disjoint_exons<-disjointExons(txdb,includeTranscripts=TRUE)

resulting in the following Granges object (yellow shaded)

GRanges object with 65726 ranges and 3 metadata columns:
             seqnames         ranges strand   |         gene_id         tx_name
                <Rle>      <IRanges>  <Rle>   | <CharacterList> <CharacterList>
      [1]  scaffold_1 [ 6883,  7179]      +   |   tetur01g00010 tetur01g00010.1
      [2]  scaffold_1 [ 7271,  7355]      +   |   tetur01g00010 tetur01g00010.1
      [3]  scaffold_1 [18549, 18719]      +   |   tetur01g00010 tetur01g00010.1
      [4]  scaffold_1 [21717, 22421]      +   |   tetur01g00010 tetur01g00010.1
      [5]  scaffold_1 [23525, 24190]      +   |   tetur01g00030 tetur01g00030.1
      ...         ...            ...    ... ...             ...             ...
  [65722] scaffold_99 [ 8664,  8967]      -   |   tetur99g00020 tetur99g00020.1

exonic_part
            <integer>
      [1]           1
      [2]           2
      [3]           3
      [4]           4
      [5]           1
      ...         ...
  [65722]           2

Based on this Granges object I would like to calculate the total transcript length for each gene_id in elementMetadata (teturxxgxxxx)?

I tried to convert the Granges object to a txdb object (using makeTxDbFromGRanges) in order to use transcriptLength but that gave an error (Error in .get_type(gr_mcols) : 'gr' must have a "type" metadata column)

Any help would be much appreciated!

 

 

genomicfeatures non overlapping exons genomicranges granges • 1.4k views
ADD COMMENT
1
Entering edit mode
@michael-lawrence-3846
Last seen 3.0 years ago
United States

If the exonic length of every gene is all you care about, then it is easier to just:

sum(width(reduce(exonsBy(txdb, "gene"))))
ADD COMMENT
0
Entering edit mode

Thank you Michael for the quick reply.

However, I would like to have the exonic length of every gene, but excluding overlap of exons with neighbouring genes.

Wannes

ADD REPLY
0
Entering edit mode

Ok, then you just need to split the widths by gene (after dropping to a vector) and sum them:

with(disjoint_exons, sum(splitAsList(width, drop(gene_id))))

That is probably the fastest way, but there are others, like this:

with(disjoint_exons, xtabs(width ~ drop(gene_id))

Also these now work in devel:

aggregate(disjoint_exons, width ~ drop(gene_id), sum)
xtabs(width ~ drop(gene_id), ex)
ADD REPLY
0
Entering edit mode

Thank you, that worked!

ADD REPLY

Login before adding your answer.

Traffic: 603 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6