subset GRanges object via ElementMetadata


Hermann Norpois ▴ 170

@hermann-norpois-5726

Last seen 10.0 years ago

Germany

Hello,

I am looking for a method to subset a GRanges object by means of values in an elementMetadata column, for instance over==2.

How does it work?

Thanks
Hermann

    > test.gr
    GRanges with 6 ranges and 3 metadata columns:
          seqnames           ranges strand |  edensity     epeak      over
             <Rle>        <IRanges>  <Rle> | <integer> <integer> <integer>
      [1]     chr1 [713844, 714487]      * |      1000       256         1
      [2]     chr1 [762136, 763199]      * |      1000       771         2
      [3]     chr1 [780124, 780289]      * |       519        74         0
      [4]     chr1 [780533, 780677]      * |       516        68         0
      [5]     chr1 [781104, 781387]      * |       601       140         0
      [6]     chr1 [793830, 794396]      * |       610       290         0
      ---
      seqlengths:
        chr1 chr10 chr11 chr12 chr13 chr14 ...  chr6  chr7  chr8  chr9  chrX  chrY
          NA    NA    NA    NA    NA    NA ...    NA    NA    NA    NA    NA    NA

    > dput(test.gr)
    new("GRanges"
        , seqnames = new("Rle"
        , values = structure(1L, .Label = c("chr1", "chr10", "chr11", "chr12", "chr13",
    "chr14", "chr15", "chr16", "chr17", "chr18", "chr19", "chr2",
    "chr20", "chr21", "chr22", "chr3", "chr4", "chr5", "chr6", "chr7",
    "chr8", "chr9", "chrX", "chrY"), class = "factor")
        , lengths = 6L
        , elementMetadata = NULL
        , metadata = list()
    )
        , ranges = new("IRanges"
        , start = c(713844L, 762136L, 780124L, 780533L, 781104L, 793830L)
        , width = c(644L, 1064L, 166L, 145L, 284L, 567L)
        , NAMES = NULL
        , elementType = "integer"
        , elementMetadata = NULL
        , metadata = list()
    )
        , strand = new("Rle"
        , values = structure(3L, .Label = c("+", "-", "*"), class = "factor")
        , lengths = 6L
        , elementMetadata = NULL
        , metadata = list()
    )
        , elementMetadata = new("DataFrame"
        , rownames = NULL
        , nrows = 6L
        , listData = structure(list(edensity = c(1000L, 1000L, 519L, 516L, 601L, 610L
    ), epeak = c(256L, 771L, 74L, 68L, 140L, 290L), over = c(1L,
    2L, 0L, 0L, 0L, 0L)), .Names = c("edensity", "epeak", "over"))
        , elementType = "ANY"
        , elementMetadata = NULL
        , metadata = list()
    )
        , seqinfo = new("Seqinfo"
        , seqnames = c("chr1", "chr10", "chr11", "chr12", "chr13", "chr14", "chr15",
    "chr16", "chr17", "chr18", "chr19", "chr2", "chr20", "chr21",
    "chr22", "chr3", "chr4", "chr5", "chr6", "chr7", "chr8", "chr9",
    "chrX", "chrY")
        , seqlengths = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,
    NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,
    NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,
    NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,
    NA_integer_, NA_integer_, NA_integer_, NA_integer_)
        , is_circular = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
    NA, NA, NA, NA, NA, NA, NA, NA, NA)
        , genome = c(NA_character_, NA_character_, NA_character_, NA_character_,
    NA_character_, NA_character_, NA_character_, NA_character_, NA_character_,
    NA_character_, NA_character_, NA_character_, NA_character_, NA_character_,
    NA_character_, NA_character_, NA_character_, NA_character_, NA_character_,
    NA_character_, NA_character_, NA_character_, NA_character_, NA_character_
    )
    )
        , metadata = list()
    )


ADD COMMENT • link updated 12.1 years ago by Arnaud Amzallag ▴ 100 • written 12.1 years ago by Hermann Norpois ▴ 170


Tim Triche ★ 4.2k

@tim-triche-3561

Last seen 4.6 years ago

United States

The shorthand method would be

    GR[ GR$over == 2 ]

and in your example,

    R> test.gr
    GRanges with 6 ranges and 3 metadata columns:
          seqnames           ranges strand |  edensity     epeak      over
             <Rle>        <IRanges>  <Rle> | <integer> <integer> <integer>
      [1]     chr1 [713844, 714487]      * |      1000       256         1
      [2]     chr1 [762136, 763199]      * |      1000       771         2
      [3]     chr1 [780124, 780289]      * |       519        74         0
      [4]     chr1 [780533, 780677]      * |       516        68         0
      [5]     chr1 [781104, 781387]      * |       601       140         0
      [6]     chr1 [793830, 794396]      * |       610       290         0
      ---
      seqlengths:
        chr1 chr10 chr11 chr12 chr13 chr14 ...  chr6  chr7  chr8  chr9  chrX  chrY
          NA    NA    NA    NA    NA    NA ...    NA    NA    NA    NA    NA    NA

    R> test.gr[ test.gr$over == 2 ]
    GRanges with 1 range and 3 metadata columns:
          seqnames           ranges strand |  edensity     epeak      over
             <Rle>        <IRanges>  <Rle> | <integer> <integer> <integer>
      [1]     chr1 [762136, 763199]      * |      1000       771         2
      ---
      seqlengths:
        chr1 chr10 chr11 chr12 chr13 chr14 ...  chr6  chr7  chr8  chr9  chrX  chrY
          NA    NA    NA    NA    NA    NA ...    NA    NA    NA    NA    NA    NA
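For anyone who wants to try the bracket approach without pasting the dput() output, here is a minimal sketch. It assumes a current GenomicRanges installation, rebuilds the object from the values in the question, and the names `hits`, `dense` and `safe` are only illustrative.

```r
library(GenomicRanges)

# Rebuild the six ranges from the question (values copied from the post).
test.gr <- GRanges(
  seqnames = "chr1",
  ranges = IRanges(
    start = c(713844L, 762136L, 780124L, 780533L, 781104L, 793830L),
    width = c(644L, 1064L, 166L, 145L, 284L, 567L)
  ),
  strand = "*",
  edensity = c(1000L, 1000L, 519L, 516L, 601L, 610L),
  epeak = c(256L, 771L, 74L, 68L, 140L, 290L),
  over = c(1L, 2L, 0L, 0L, 0L, 0L)
)

# test.gr$over is shorthand for mcols(test.gr)$over, so this is an
# ordinary logical index over the ranges.
hits <- test.gr[test.gr$over == 2]

# Conditions on several metadata columns combine with & and |.
dense <- test.gr[test.gr$over > 0 & test.gr$edensity >= 1000]

# which() keeps only the TRUE positions, so NA comparisons are dropped.
safe <- test.gr[which(mcols(test.gr)$over == 2)]
```

The `which()` form is worth considering when a metadata column may contain missing values, since the comparison then yields NA for those rows.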

ADD COMMENT • link 12.1 years ago Tim Triche ★ 4.2k


Btw, I hacked together a subset() method for GenomicRanges yesterday. It respects the metadata columns. Someone could probably come up with some reason why that violates the conceptual foundations of something, but I find it useful.

So you could do:

    subset(gr, over == 2)

Will commit shortly.

Michael

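A minimal sketch of the subset() form, assuming a current GenomicRanges/S4Vectors installation in which this method is available. The object is rebuilt from the question's values; `by_rows` and `trimmed` are illustrative names, and the `select` argument is used here on the assumption that it behaves like its base R counterpart but applies to the metadata columns.

```r
library(GenomicRanges)

# Rebuild the six ranges from the question (values copied from the post).
test.gr <- GRanges(
  seqnames = "chr1",
  ranges = IRanges(
    start = c(713844L, 762136L, 780124L, 780533L, 781104L, 793830L),
    width = c(644L, 1064L, 166L, 145L, 284L, 567L)
  ),
  strand = "*",
  edensity = c(1000L, 1000L, 519L, 516L, 601L, 610L),
  epeak = c(256L, 771L, 74L, 68L, 140L, 290L),
  over = c(1L, 2L, 0L, 0L, 0L, 0L)
)

# Rows: the condition is evaluated against the metadata columns,
# so `over` can be written without the test.gr$ prefix.
by_rows <- subset(test.gr, over == 2)

# Rows and columns: `select` keeps only the named metadata columns
# (assumed to mirror base::subset for data frames).
trimmed <- subset(test.gr, over > 0, select = c(epeak, over))
```

Compared with `test.gr[test.gr$over == 2]`, the main gain is brevity when the object name is long or the condition mentions several columns.
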
ADD REPLY • link 12.1 years ago Michael Lawrence ★ 11k


On Fri, Feb 22, 2013 at 3:56 PM, Michael Lawrence wrote:

> Btw, I hacked together a subset() method for GenomicRanges yesterday. It respects the metadata columns. Someone could probably come up with some reason why that violates the conceptual foundations of something, but I find it useful.
>
> So you could do:
>
>     subset(gr, over == 2)
>
> Will commit shortly.

Yeah! My `love-of-all-things-semantically-impure`-ing self has been wanting this one for a long time :-)

http://thread.gmane.org/gmane.comp.lang.r.sequencing/1239

Thanks!
-steve

ADD REPLY • link 12.1 years ago Steve Lianoglou ★ 13k


Hi Michael,

On 02/22/2013 12:56 PM, Michael Lawrence wrote:

> So you could do:
>
>     subset(gr, over == 2)

Sounds good to me. Hopefully you set the method on Vector objects, rather than just GenomicRanges objects.

Thanks,
H.
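To illustrate what defining the method on Vector buys, here is a small sketch with a plain IRanges object, which is also a Vector. It assumes a current IRanges/S4Vectors installation; the ranges and the `over` column are made up for the example and `picked` is an illustrative name.

```r
library(IRanges)

# Three toy ranges with one metadata column (values are invented).
ir <- IRanges(start = c(1L, 10L, 20L), width = 5L)
mcols(ir) <- DataFrame(over = c(1L, 2L, 0L))

# IRanges extends Vector, so a subset() method defined on Vector
# applies here without any GenomicRanges-specific code.
is_vector <- is(ir, "Vector")
picked <- subset(ir, over == 2)
```

The same call shape as for a GRanges object works because the condition is evaluated against the metadata columns, which every Vector can carry.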

ADD REPLY • link 12.1 years ago Hervé Pagès 16k


Hi Hervé,

That's what I ended up doing, actually. One question that came up though is whether we want to support 2D subsetting of all (or at least most) Vector objects, in the same manner as GRanges. I think it would work, how about you?

Michael

ADD REPLY • link 12.1 years ago Michael Lawrence ★ 11k


Hi Michael,

On 02/23/2013 04:50 AM, Michael Lawrence wrote:

> That's what I ended up doing, actually. One question that came up though is whether we want to support 2D subsetting of all (or at least most) Vector objects, in the same manner as GRanges. I think it would work, how about you?

If by 2D subsetting you're referring to gr[i,j], I'm opposed to it. I think it's a mistake to try to put the 2D *low-level* API on top of objects that are conceptually not 2D objects. The current situation, where 2D subsetting already works on both GRanges and GRangesList objects but does different things, is messy and tells me that we shouldn't have provided this in the first place. Sounds like the gr$foo story again. Hopefully gr$foo will remain a one-time exception.

I think subset() is already giving you something similar to the 2D subsetting, right?

H.

ADD REPLY • link 12.1 years ago Hervé Pagès 16k


The [i, meta_j] syntax is probably not very useful (at least in my experience), so I wouldn't favor it to the extent of $meta_name. For now, subset(select=) is probably sufficient.

Michael

On Mon, Feb 25, 2013 at 12:00 AM, Hervé Pagès <hpages@fhcrc.org> wrote:

> Hi Michael,
>
> On 02/23/2013 04:50 AM, Michael Lawrence wrote:
>
>> Hi Hervé,
>>
>> That's what I ended up doing, actually. One question that came up though is whether we want to support 2D subsetting of all (or at least most) Vector objects, in the same manner as GRanges. I think it would work, how about you?
>
> If by 2D subsetting you're referring to gr[i,j], I'm opposed to it. I think it's a mistake to try to put the 2D *low-level* API on top of objects that are conceptually not 2D objects. The current situation where we have 2D subsetting already work on both GRanges and GRangesList objects but do different things is messy and tells me that we shouldn't have provided this in the first place.
>
> Sounds like the gr$foo story again. Hopefully gr$foo will remain a 1 time exception.
>
> I think subset() is already giving you something similar to the 2D subsetting right?
>
> H.
>
>> Michael
>>
>> On Fri, Feb 22, 2013 at 5:33 PM, Hervé Pagès <hpages@fhcrc.org> wrote:
>>
>>> Hi Michael,
>>>
>>> On 02/22/2013 12:56 PM, Michael Lawrence wrote:
>>>
>>>> Btw, I hacked together a subset() method for GenomicRanges yesterday. It respects the metadata columns. Someone could probably come up with some reason why that violates the conceptual foundations of something, but I find it useful.
>>>>
>>>> So you could do:
>>>> subset(gr, over == 2)
>>>>
>>>> Will commit shortly.
>>>>
>>>> Michael
>>>
>>> Sounds good to me. Hopefully you set the method on Vector objects, rather than just GenomicRanges objects.
>>>
>>> Thanks,
>>> H.

ADD REPLY • link 12.1 years ago Michael Lawrence ★ 11k
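For readers comparing the idioms debated in this exchange, here is a minimal sketch, assuming a current Bioconductor installation with GenomicRanges available. It rebuilds the six ranges from the question with the standard GRanges() and IRanges() constructors and applies subset(), the $ shorthand, subset(select=) and the gr[i, j] form. Whether select= and the two-argument bracket are accepted may depend on the installed version, so treat those two lines as assumptions rather than guaranteed behaviour.

```r
# Requires the Bioconductor package GenomicRanges.
library(GenomicRanges)

# Rebuild the six ranges from the question (start/width taken from the dput output).
test.gr <- GRanges(
  seqnames = "chr1",
  ranges = IRanges(
    start = c(713844L, 762136L, 780124L, 780533L, 781104L, 793830L),
    width = c(644L, 1064L, 166L, 145L, 284L, 567L)
  ),
  edensity = c(1000L, 1000L, 519L, 516L, 601L, 610L),
  epeak = c(256L, 771L, 74L, 68L, 140L, 290L),
  over = c(1L, 2L, 0L, 0L, 0L, 0L)
)

# Row filter with the metadata column referred to by name.
by_subset <- subset(test.gr, over == 2)

# The same row filter using the $ shorthand.
by_dollar <- test.gr[test.gr$over == 2]

# Row filter plus metadata-column selection, written two ways.
# Assumption: both forms are supported by the installed version.
by_select <- subset(test.gr, over == 2, select = over)
by_2d <- test.gr[test.gr$over == 2, "over"]

print(by_subset)
print(colnames(mcols(by_select)))
print(colnames(mcols(by_2d)))
```

The first two forms keep all three metadata columns; the last two keep only over, which is the part of the 2D syntax that subset(select=) is said to cover.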


Arnaud Amzallag ▴ 100

@arnaud-amzallag-4471

Last seen 8.1 years ago

test.gr[values(test.gr)$over %in% 2]

works.

test.gr[values(test.gr)$over == 2] works too if over does not contain NAs.

Arnaud

On Feb 22, 2013, at 10:33 AM, Hermann Norpois wrote:

> Hello,
>
> I am looking for a method to subset a GRangesObject by means of values (or ElementMetadata column), for instance over==2.
>
> How does it work?
>
> Thanks
> Hermann

ADD COMMENT • link 12.1 years ago Arnaud Amzallag ▴ 100
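The difference between the two comparisons in this answer comes from base R and can be seen without any Bioconductor objects. The sketch below uses a plain integer vector as a hypothetical stand-in for values(test.gr)$over, with one NA added for illustration; it is not the data from the question.

```r
# Hypothetical stand-in for values(test.gr)$over, with one NA added.
over <- c(1L, 2L, 0L, NA, 0L, 0L)

# == propagates NA into the logical mask.
eq_mask <- over == 2

# %in% is built on match() and always returns TRUE or FALSE.
in_mask <- over %in% 2

print(eq_mask)
print(in_mask)
print(anyNA(eq_mask))
print(anyNA(in_mask))
```

Because in_mask never contains NA, it is safe to use as a subscript even where NA subscripts are rejected.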


That's odd... I added an NA and sure enough, it fails:

R> test.gr[ test.gr$over == 2 ]
Error in IRanges:::normalizeSingleBracketSubscript(i, x) :
  subscript contains NAs

But which() works fine:

R> test.gr[ which(test.gr$over == 2) ]
GRanges with 1 range and 3 metadata columns:
      seqnames           ranges strand |  edensity     epeak      over
         <Rle>        <IRanges>  <Rle> | <integer> <integer> <integer>
  [1]     chr1 [762136, 763199]      * |      1000       771         2
  ---

I wonder if this is an easy fix, too?

On Fri, Feb 22, 2013 at 2:26 PM, Arnaud Amzallag <arnaud.amzallag@gmail.com> wrote:

> test.gr[values(test.gr)$over %in% 2]
>
> works.
>
> test.gr[values(test.gr)$over == 2] works too if over does not contain NAs.
>
> Arnaud

--
*A model is a lie that helps you see the truth.*
Howard Skipper

ADD REPLY • link 12.1 years ago Tim Triche ★ 4.2k
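A small reproduction of the behaviour described in this reply, as a sketch that assumes GenomicRanges is installed. The three-range object and its NA are made up for illustration. The direct logical subscript is wrapped in tryCatch() because the exact outcome and error message may differ between package versions; only the which() route is relied on.

```r
# Requires the Bioconductor package GenomicRanges.
library(GenomicRanges)

# Illustrative object: three ranges, with an NA in the 'over' column.
gr <- GRanges(
  seqnames = "chr1",
  ranges = IRanges(start = c(713844L, 762136L, 780124L),
                   width = c(644L, 1064L, 166L)),
  over = c(1L, 2L, NA)
)

# Logical subscript containing NA: reported in this thread to fail.
# Capture the outcome instead of assuming it.
direct <- tryCatch(gr[gr$over == 2], error = function(e) conditionMessage(e))

# which() keeps only the positions that are TRUE, so NA never reaches `[`.
idx <- which(gr$over == 2)
hit <- gr[idx]

print(direct)
print(hit)
```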


On 02/22/2013 02:35 PM, Tim Triche, Jr. wrote:

> That's odd... I added an NA and sure enough, it fails:
>
> R> test.gr[ test.gr$over == 2 ]
> Error in IRanges:::normalizeSingleBracketSubscript(i, x) :
>   subscript contains NAs
>
> But which() works fine:
>
> R> test.gr[ which(test.gr$over == 2) ]
> GRanges with 1 range and 3 metadata columns:
>       seqnames           ranges strand |  edensity     epeak      over
>          <Rle>        <IRanges>  <Rle> | <integer> <integer> <integer>
>   [1]     chr1 [762136, 763199]      * |      1000       771         2
>   ---
>
> I wonder if this is an easy fix, too?

In base R, subscripting with NA leads to

> x = 1:5
> x[NA]
[1] NA NA NA NA NA

which makes a weird sense (recycling a length 1 NA), but IRanges/GRanges don't support the notion of NA-ranges. So not implemented by design, and hence not fixable, is probably the answer.

Martin

> On Fri, Feb 22, 2013 at 2:26 PM, Arnaud Amzallag <arnaud.amzallag at gmail.com> wrote:
>
>> test.gr[values(test.gr)$over %in% 2]
>>
>> works.
>>
>> test.gr[values(test.gr)$over == 2] works too if over does not contain NAs.
>>
>> Arnaud

--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793

ADD REPLY • link 12.1 years ago Martin Morgan 25k
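The base R behaviour mentioned in this reply can be spelled out a little further. The sketch below is plain R with no Bioconductor dependency; the variable names are only for illustration. It shows that an NA subscript produces an NA element in the result, which is the thing a ranges object has no way to represent.

```r
x <- 1:5

# A bare NA is logical, so it is recycled to the length of x.
recycled <- x[NA]

# An integer NA is a single position, so the result has length 1.
single <- x[NA_integer_]

# An NA inside a logical mask yields an NA element at that position.
mask <- c(TRUE, NA, FALSE, TRUE, FALSE)
with_na <- x[mask]

# which() converts the mask to the positions that are TRUE, dropping the NA.
without_na <- x[which(mask)]

print(recycled)
print(single)
print(with_na)
print(without_na)
```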
