using disjoin() for copy number as Rle() columns of a SummarizedExperiment
Tim Triche ★ 4.2k
@tim-triche-3561
Last seen 4.4 years ago
United States
I have a pile of copy number array results that have been segmented and assigned regional p-values by GISTIC. (I have piles of other data that have all been converted to SummarizedExperiments now, because SEs rule). I'd like to slot them in with the rest of the data in a container that automatically "packs out" missing assays/features across SummarizedExperiments by adding NAs, because that allows me to query the whole kit and caboodle using an arbitrary range or ranges.

In order to not have the CNV results be obscenely large, I'd like to take all of the GRanges of each patient's segmented results, and use disjoin() to get a minimal representation that I can then query by feature/range/whatever for overlapping mutations, aberrations, or what have you, against the other data on the patients. This was mostly Sean Davis' idea but I only just recently figured out how it could make sense for my application (yes I know that is pathetic). Anyways...

To be concrete, I tried the following without success:

  CNV.ranges <- do.call(disjoin, CNV.GRL)
  CNV.ranges <- disjoin(CNV.GRL)

After thinking about it a bit more, I tried

  CNV.ranges <- disjoin(unlist(CNV.GRL))

and that seemed to be the correct magic. However,

  Reduce(sum, sapply(CNV.GRL, length))
  ## [1] 438762

versus

  length(CNV.ranges)
  ## [1] 352036

Does this seem right? Also, once I've done this, what's the most efficient way to turn all the patients' results into Rle() columns against CNV.ranges?

Thanks for any assistance,

--t

--
*A model is a lie that helps you see the truth.*
Howard Skipper <http://cancerres.aacrjournals.org/content/31/9/1173.full.pdf>
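To make the counting question concrete, here is a minimal sketch on toy data, not the poster's actual results. The two patients and their coordinates are made up for illustration, and lengths() is used as the list-length accessor (an assumption about the installed Bioconductor version; the elementLengths() accessor of this era was later renamed elementNROWS()). It shows that disjoin() on the unlisted GRangesList can legitimately return fewer ranges than the total number of input segments, because breakpoints shared between patients collapse into a single disjoint range.

```r
library(GenomicRanges)

# Hypothetical segmentations for two patients (made-up coordinates).
p1 <- GRanges("chr1", IRanges(start = c(1, 101), end = c(100, 200)))
p2 <- GRanges("chr1", IRanges(start = c(1, 51),  end = c(50, 200)))
CNV.GRL <- GRangesList(p1 = p1, p2 = p2)

# Total number of segments across all patients.
total.segments <- sum(lengths(CNV.GRL))

# With the default accumulate = FALSE, Reduce() folds to a single total,
# so it agrees with the direct sum on this input.
reduce.total <- Reduce(sum, sapply(CNV.GRL, length))

# Pool every patient's segments, then split at every breakpoint.
CNV.ranges <- disjoin(unlist(CNV.GRL, use.names = FALSE))

total.segments      # 4 input segments
length(CNV.ranges)  # 3 disjoint ranges: 1-50, 51-100, 101-200
```

Here the segments starting at position 1 and those ending at position 200 share breakpoints across the two patients, so four input segments yield three disjoint ranges. A disjoint count below the pooled segment count is therefore not suspicious in itself.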
@michael-lawrence-3846
Last seen 3.1 years ago
United States
On Thu, Jun 21, 2012 at 11:29 AM, Tim Triche, Jr. <tim.triche@gmail.com> wrote:

> I have a pile of copy number array results that have been segmented and
> assigned regional p-values by GISTIC. (I have piles of other data that
> have all been converted to SummarizedExperiments now, because SEs rule).
> I'd like to slot them in with the rest of the data in a container that
> automatically "packs out" missing assays/features across
> SummarizedExperiments by adding NAs, because that allows me to query the
> whole kit and caboodle using an arbitrary range or ranges.
>
> In order to not have the CNV results be obscenely large, I'd like to take
> all of the GRanges of each patient's segmented results, and use disjoin()
> to get a minimal representation that I can then query by
> feature/range/whatever for overlapping mutations, aberrations, or what have
> you, against the other data on the patients. This was mostly Sean Davis'
> idea but I only just recently figured out how it could make sense for my
> application (yes I know that is pathetic). Anyways...
>
> To be concrete, I tried the following without success:
>
> CNV.ranges <- do.call(disjoin, CNV.GRL)
> CNV.ranges <- disjoin(CNV.GRL)
>
> After thinking about it a bit more, I tried
>
> CNV.ranges <- disjoin(unlist(CNV.GRL))
>
> and that seemed to be the correct magic. However,
>
> Reduce(sum, sapply(CNV.GRL, length))
> ## [1] 438762

I think you wanted sum(elementLengths(CNV.GRL))? The above is a cumsum, I think.

> versus
>
> length(CNV.ranges)
> ## [1] 352036
>
> Does this seem right? Also, once I've done this, what's the most efficient
> way to turn all the patients' results into Rle() columns against
> CNV.ranges?

You would use findOverlaps to map the disjoined regions back to the original regions and then Rle() the resulting vectors.
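The findOverlaps() suggestion can be sketched roughly as follows. This is an illustration under stated assumptions, not code from the thread: the toy patients, the metadata column name cn, and the helper segmentColumn() are all hypothetical. It also assumes each patient's own segments do not overlap one another, so every disjoint range hits at most one segment per patient.

```r
library(GenomicRanges)

# Hypothetical per-patient segments with a made-up copy-number column "cn".
patients <- list(
  p1 = GRanges("chr1", IRanges(c(1, 101), c(100, 200)), cn = c(2, 3)),
  p2 = GRanges("chr1", IRanges(c(1, 51),  c(50, 150)),  cn = c(1, 2))
)
CNV.GRL <- GRangesList(patients)

# Shared disjoint coordinate system across all patients.
CNV.ranges <- disjoin(unlist(CNV.GRL, use.names = FALSE))

# Hypothetical helper: map one patient's segment values onto the disjoint
# ranges, leaving NA where the patient has no segment, then compress as Rle.
segmentColumn <- function(segs, bins, value = "cn") {
  hits <- findOverlaps(bins, segs)
  vals <- rep(NA_real_, length(bins))
  vals[queryHits(hits)] <- mcols(segs)[[value]][subjectHits(hits)]
  Rle(vals)
}

cols <- lapply(patients, segmentColumn, bins = CNV.ranges)

# One Rle column per patient, aligned row-for-row with CNV.ranges.
cn.cols <- do.call(DataFrame, cols)
```

In this toy case CNV.ranges has four ranges (1-50, 51-100, 101-150, 151-200). Patient p2 has no segment covering 151-200, so its last entry is NA. Because neighbouring disjoint ranges often carry the same value within a patient, the Rle encoding stores runs instead of one value per range.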
Perfect! Thanks much. And yes, I think you're right about the way I used Reduce().

--t

On Thu, Jun 21, 2012 at 7:27 PM, Michael Lawrence <lawrence.michael@gene.com> wrote:

> You would use findOverlaps to map the disjoined regions back to the
> original regions and then Rle() the resulting vectors.