Collapse/Reduce in GRanges by Metadata Column
nhejazi • 0
@nhejazi-10825
UC Berkeley

I've only recently adopted GRanges for problems like the one below; my apologies in advance if this or a similar question has been answered many times already (I couldn't find anything in my searches).

I have a GRanges object like the one below. I'd like to collapse it by the values in the metadata column X: for each distinct value of X, the new GRanges object should contain a single range that starts at the first position and ends at the last position of all the original ranges (really, single positions) sharing that value. For example, for X == 1 in the table below, the new GRanges object should have a single entry with `seqnames` = 16 and `ranges` = [60299, 60370].

I can think of various ways to do this with a data.frame, but I'd rather avoid the inefficiency of doing that and then coercing the result back to GRanges. Is there an efficient way to do this?

GRanges object with 1507062 ranges and 1 metadata column:
            seqnames         ranges strand |         X
               <Rle>      <IRanges>  <Rle> | <numeric>
        [1]       16 [60299, 60299]      * |         1
        [2]       16 [60304, 60304]      * |         1
        [3]       16 [60336, 60336]      * |         1
        [4]       16 [60370, 60370]      * |         1
        [5]       16 [61024, 61024]      * |         2
        ...      ...            ...    ... .       ...
  [1507058]       MT [16412, 16412]      * |    240330
  [1507059]       MT [16428, 16428]      * |    240330
  [1507060]       MT [16450, 16450]      * |    240330
  [1507061]       MT [16495, 16495]      * |    240330
  [1507062]       MT [16542, 16542]      * |    240330
  -------
  seqinfo: 84 sequences from an unspecified genome; no seqlengths

granges merge reduce • 3.5k views
@michael-lawrence-3846
United States

unlist(range(split(gr, ~X)))
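
To make the one-liner above concrete, here is a minimal sketch on a toy object built from the first five rows shown in the question. The name `gr` follows the answer; `grl` and `collapsed` are names made up for this illustration, and the sketch splits on `gr$X` explicitly, which is assumed to group the ranges the same way as the formula `~X`.

```r
library(GenomicRanges)

# Toy data: the first five positions from the question (assumed values only).
gr <- GRanges(
  seqnames = rep("16", 5),
  ranges = IRanges(start = c(60299, 60304, 60336, 60370, 61024), width = 1),
  X = c(1, 1, 1, 1, 2)
)

# Split into a GRangesList with one element per distinct value of X.
grl <- split(gr, gr$X)

# range() computes the span (min start to max end) within each element;
# unlist() flattens the result back into a single GRanges.
collapsed <- unlist(range(grl))

# The X column is not carried over; the group labels survive as names,
# so they can be restored as a metadata column if needed.
collapsed$X <- as.numeric(names(collapsed))
collapsed
```

Note that `range()` on a GRangesList works per sequence and strand within each element, so a value of X whose ranges sit on more than one chromosome or strand would yield more than one range for that value.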
