Question

Collapse/Reduce in GRanges by Metadata Column

nhejazi • 0

@nhejazi-10825

UC Berkeley

I've only recently adopted GRanges for problems like the one below; my apologies in advance if this or similar questions have been answered many times already (I couldn't find anything in various searches...)

I have a GRanges structure like the one below. Using values in the metadata column X, I'd like to collapse `seqnames` and `ranges` information, such that, for all genomic ranges (really, positions) in this table, a new GRanges structure is generated where the `ranges` value starts with the first position and ends with the last position for all original ranges falling under a single bin in the metadata column. For example, referring to the table below, for X == 1, the new GRanges table that I wish to generate should have a single entry with `seqnames` = 16 and ranges = [60299, 60370].

I can think of various ways to do this with a data.frame, but I'd rather avoid the inefficiency of doing that and then coercing the result back to GRanges. Is there an efficient way to do this directly on the GRanges object?

    GRanges object with 1507062 ranges and 1 metadata column:
                seqnames         ranges strand |         X
                   <Rle>      <IRanges>  <Rle> | <numeric>
            [1]       16 [60299, 60299]      * |         1
            [2]       16 [60304, 60304]      * |         1
            [3]       16 [60336, 60336]      * |         1
            [4]       16 [60370, 60370]      * |         1
            [5]       16 [61024, 61024]      * |         2
            ...      ...            ...    ... .       ...
      [1507058]       MT [16412, 16412]      * |    240330
      [1507059]       MT [16428, 16428]      * |    240330
      [1507060]       MT [16450, 16450]      * |    240330
      [1507061]       MT [16495, 16495]      * |    240330
      [1507062]       MT [16542, 16542]      * |    240330
      -------
      seqinfo: 84 sequences from an unspecified genome; no seqlengths

granges merge reduce • 3.7k views

score 4 · Accepted Answer · 2016-06-28

Michael Lawrence ★ 11k

@michael-lawrence-3846

United States

unlist(range(split(gr, ~X)))
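
For anyone wanting to try this on a small object first, here is a minimal sketch that rebuilds the first five rows shown in the question and collapses them by `X`. The toy `gr` and the name `collapsed` are made up for illustration, and the sketch splits on `gr$X` explicitly, on the assumption that this groups the ranges the same way as the `~X` formula in the one-liner above.

```r
library(GenomicRanges)

# Toy object mirroring the first five rows of the question
# (single-base positions on chromosome 16, two bins in X).
gr <- GRanges(
  seqnames = rep("16", 5),
  ranges   = IRanges(start = c(60299, 60304, 60336, 60370, 61024), width = 1),
  X        = c(1, 1, 1, 1, 2)
)

# 1. split(): one GRangesList element per distinct value of X
# 2. range(): smallest range spanning each element (per seqname and strand)
# 3. unlist(): flatten back into a single GRanges
collapsed <- unlist(range(split(gr, gr$X)))

collapsed
```

On this toy input the bin X == 1 should come back as a single range on chromosome 16 spanning [60299, 60370], as asked for in the question. Two things to keep in mind: `range()` works per chromosome and strand, so a bin whose positions sit on more than one chromosome yields more than one range; and the `X` column itself is not carried over, so the bin label has to be recovered from `names(collapsed)`.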
