using disjoin() for copy number as Rle() columns of a SummarizedExperiment


Tim Triche ★ 4.2k

@tim-triche-3561

Last seen 4.4 years ago

United States

I have a pile of copy number array results that have been segmented and assigned regional p-values by GISTIC. (I have piles of other data that have all been converted to SummarizedExperiments now, because SEs rule.) I'd like to slot them in with the rest of the data in a container that automatically "packs out" missing assays/features across SummarizedExperiments by adding NAs, because that allows me to query the whole kit and kaboodle using an arbitrary range or ranges.

In order to not have the CNV results be obscenely large, I'd like to take all of the GRanges of each patient's segmented results and use disjoin() to get a minimal representation that I can then query by feature/range/whatever for overlapping mutations, aberrations, or what have you, against the other data on the patients. This was mostly Sean Davis' idea, but I only just recently figured out how it could make sense for my application (yes, I know that is pathetic). Anyways...

To be concrete, I tried the following without success:

    CNV.ranges <- do.call(disjoin, CNV.GRL)
    CNV.ranges <- disjoin(CNV.GRL)

After thinking about it a bit more, I tried

    CNV.ranges <- disjoin(unlist(CNV.GRL))

and that seemed to be the correct magic. However,

    Reduce(sum, sapply(CNV.GRL, length))
    ## [1] 438762

versus

    length(CNV.ranges)
    ## [1] 352036

Does this seem right? Also, once I've done this, what's the most efficient way to turn all the patients' results into Rle() columns against CNV.ranges?

Thanks for any assistance,

--t
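To illustrate why the disjoined set can be smaller than the pooled segment count, here is a minimal sketch on hypothetical toy data (the two patients and their coordinates are invented for illustration, and `elementNROWS()` is used as the current name for what older Bioconductor releases called `elementLengths()`). Segments from different patients that share breakpoints collapse into the same disjoint pieces, so the count can drop; with other layouts, such as one segment nested inside another, it can also grow.

```r
library(GenomicRanges)

# Hypothetical toy data: two patients, each with two segments on chr1.
p1 <- GRanges("chr1", IRanges(start = c(1, 101), end = c(100, 200)))
p2 <- GRanges("chr1", IRanges(start = c(1, 151), end = c(150, 200)))
CNV.GRL <- GRangesList(patient1 = p1, patient2 = p2)

# Total number of input segments across all patients.
n.segments <- sum(elementNROWS(CNV.GRL))

# Pool every patient's segments, then split at every breakpoint.
CNV.ranges <- disjoin(unlist(CNV.GRL, use.names = FALSE))

n.segments          # pooled segment count
length(CNV.ranges)  # number of disjoint pieces
```

Here the four input segments share the breakpoints at 1 and 200, so only three disjoint pieces remain: 1-100, 101-150, and 151-200.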


Updated 12.6 years ago by Michael Lawrence ★ 11k • written 12.6 years ago by Tim Triche ★ 4.2k


Michael Lawrence ★ 11k

@michael-lawrence-3846

Last seen 3.1 years ago

United States

On Thu, Jun 21, 2012 at 11:29 AM, Tim Triche, Jr. <tim.triche@gmail.com> wrote:

> I have a pile of copy number array results that have been segmented and
> assigned regional p-values by GISTIC. (I have piles of other data that
> have all been converted to SummarizedExperiments now, because SEs rule).
> I'd like to slot them in with the rest of the data in a container that
> automatically "packs out" missing assays/features across
> SummarizedExperiments by adding NAs, because that allows me to query the
> whole kit and kaboodle using an arbitrary range or ranges.
>
> In order to not have the CNV results be obscenely large, I'd like to take
> all of the GRanges of each patient's segmented results, and use disjoin()
> to get a minimal representation that I can then query by
> feature/range/whatever for overlapping mutations, aberrations, or what have
> you, against the other data on the patients. This was mostly Sean Davis'
> idea but I only just recently figured out how it could make sense for my
> application (yes I know that is pathetic). Anyways...
>
> To be concrete, I tried the following without success:
>
> CNV.ranges <- do.call(disjoin, CNV.GRL)
> CNV.ranges <- disjoin(CNV.GRL)
>
> After thinking about it a bit more, I tried
>
> CNV.ranges <- disjoin(unlist(CNV.GRL))
>
> and that seemed to be the correct magic. However,
>
> Reduce(sum, sapply(CNV.GRL, length))
> ## [1] 438762

I think you wanted: sum(elementLengths(CNV.GRL))? The above is a cumsum, I think.

> versus
>
> length(CNV.ranges)
> ## [1] 352036
>
> Does this seem right? Also, once I've done this, what's the most efficient
> way to turn all the patients' results into Rle() columns against
> CNV.ranges?

You would use findOverlaps to map the disjoined regions back to the original regions and then Rle() the resulting vectors.
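The findOverlaps() suggestion above might be sketched roughly as follows. This is an illustration only, not necessarily the exact code the answer had in mind: the toy patients, the `cn` metadata column holding the segment copy number, and the helper name `toRle` are all hypothetical. It also assumes each patient's segments do not overlap one another, so every disjoint range hits at most one segment per patient.

```r
library(GenomicRanges)

# Hypothetical toy data; `cn` is an assumed metadata column for copy number.
p1 <- GRanges("chr1", IRanges(c(1, 101), c(100, 200)), cn = c(2, 3))
p2 <- GRanges("chr1", IRanges(c(1, 151), c(150, 200)), cn = c(2, 1))
CNV.GRL <- GRangesList(patient1 = p1, patient2 = p2)
CNV.ranges <- disjoin(unlist(CNV.GRL, use.names = FALSE))

# Hypothetical helper: map each disjoint range back to the patient segment
# that covers it, then run-length encode the per-range values.
toRle <- function(segs, bins) {
  hits <- findOverlaps(bins, segs)
  values <- rep(NA_real_, length(bins))  # NA where the patient has no segment
  values[queryHits(hits)] <- segs$cn[subjectHits(hits)]
  Rle(values)
}

# One Rle per patient, each aligned to CNV.ranges.
CNV.rle <- lapply(CNV.GRL, toRle, bins = CNV.ranges)
```

Each element of `CNV.rle` has one value per range in `CNV.ranges`, so the results could then be gathered column by column into a DataFrame, which accepts Rle columns.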

12.6 years ago • Michael Lawrence ★ 11k


Perfect! Thanks much. And yes, I think you're right about the way I used Reduce().

--t

On Thu, Jun 21, 2012 at 7:27 PM, Michael Lawrence <lawrence.michael@gene.com> wrote:

> You would use findOverlaps to map the disjoined regions back to the
> original regions and then Rle() the resulting vectors.
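As a side check on the Reduce() question, the behaviour of base R's `Reduce()` can be compared directly on hypothetical per-patient counts (the numbers below are made up). As far as base R goes, `Reduce()` folds to a single value unless `accumulate = TRUE` is given, so the original call should already have returned the grand total rather than a running sum, and `sum()` over the element lengths is simply the more direct way to write it.

```r
# Hypothetical per-patient segment counts.
seg.counts <- c(patient1 = 2L, patient2 = 2L, patient3 = 5L)

# Folding with sum() yields one grand total.
total <- Reduce(sum, seg.counts)

# A running (cumulative) sum requires accumulate = TRUE.
running <- Reduce(sum, seg.counts, accumulate = TRUE)

# The plain sum() gives the same grand total as the fold.
same.total <- total == sum(seg.counts)
```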

12.6 years ago • Tim Triche ★ 4.2k
