subsetting GRanges object based on gene IDs
Assa Yeroslaviz ★ 1.5k

Hi, I have a GRanges object, e.g.:

GRanges object with 6 ranges and 1 metadata column:
           seqnames         ranges strand |     gene_id
              <Rle>      <IRanges>  <Rle> | <character>
  15S_rRNA       MT [ 6546,  8194]      + |    15S_rRNA
  21S_rRNA       MT [58009, 62447]      + |    21S_rRNA
     Q0010       MT [ 3952,  4338]      + |       Q0010
     Q0017       MT [ 4254,  4415]      + |       Q0017
     Q0032       MT [11667, 11957]      + |       Q0032
     Q0045       MT [13818, 26701]      + |       Q0045
  -------
  seqinfo: 17 sequences (1 circular) from an unspecified genome; no seqlengths

Now I would like to extract only specific rows from it. Let's say I want the genes Q0017 and 15S_rRNA.

I have tried to do it with the subset command, but it only takes one pattern:

subset(genes.mod, gene_id == "Q0017")

Is there a way to match multiple patterns without running it once for each gene?

Thanks, Assa


genomicranges genomicfeatures granges subsetting • 4.4k views

Did you consider using %in% instead of ==?
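
For anyone landing here later, a minimal sketch of what that looks like. The object below is rebuilt by hand from the printout in the question so the example runs on its own; `wanted`, `hits` and `hits2` are names made up for illustration, and it assumes the GenomicRanges package is installed.

```r
library(GenomicRanges)

# Toy copy of the object from the question (coordinates taken from the printout)
genes.mod <- GRanges(
  seqnames = "MT",
  ranges = IRanges(
    start = c(6546, 58009, 3952, 4254, 11667, 13818),
    end   = c(8194, 62447, 4338, 4415, 11957, 26701)
  ),
  strand = "+",
  gene_id = c("15S_rRNA", "21S_rRNA", "Q0010", "Q0017", "Q0032", "Q0045")
)
names(genes.mod) <- genes.mod$gene_id

# Hypothetical vector holding the IDs to keep
wanted <- c("Q0017", "15S_rRNA")

# %in% tests each gene_id against the whole vector, so one call covers all IDs
hits <- subset(genes.mod, gene_id %in% wanted)

# Equivalent bracket form, without subset()
hits2 <- genes.mod[genes.mod$gene_id %in% wanted]
```

Note that `%in%` keeps the ranges in the order they have in `genes.mod`, not in the order of `wanted`. If the order of `wanted` matters, indexing with `match(wanted, genes.mod$gene_id)` should give that instead, provided every ID is actually present.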


Thanks a lot. This does the job perfectly.

How embarrassing.


Hi Assa,

Don't worry about embarrassment. You asked an excellent question about programming efficiently, and it is also useful to others reading the support site.

As far as resources go, %in% is not discussed in the GenomicRanges introduction vignette, though it is mentioned in the man page examples and the howto. You might also find the Advanced R book quite useful.
