Entering edit mode
swarmqq@gmail.com
▴
10
@swarmqqgmailcom-6572
Last seen 10.4 years ago
Thanks Tim for sharing this.
I think it is very useful. But I kind of stuck at generating
genomeRatioSet, it has error message.
*> gRatioSet <- mapToGenome(ratioSet, mergeManifest = TRUE)Error in
.Call2("solve_user_SEW0", start, end, width, PACKAGE = "IRanges") :
solving row 1: range cannot be determined from the supplied arguments
(too
many NAs)*I am not sure what's wrong? If you happen to know, could you
please share it with me?
Thanks a lot for the help.
Emma
On Saturday, February 1, 2014 2:06:59 PM UTC-5, Tim Triche, Jr. wrote:
>
> There are at least 3 different ways you could go about this; Tiffany
> Morris' probe variant annotation package accompanying ChAMP is
probably the
> most comprehensive, while (I claim) the UCSC common variant track is
the
> most trivial and the SNPlocs packages are the most intensive. Let's
assume
> your data is in a GenomicRatioSet named grSet for all examples.
>
> Writing and testing out the code for this took a little while so I
do hope
> you'll work through each example.
>
> All things considered, I'd recommend option #1, but it would be even
> better if the maintainers tweaked a few settings so that
> bsseq::data.frame2GRangesprobe.450K.VCs.af) was more trivial.
(Also:
> consider merging $diVC.com1pop.F and $diVC.com1pop.R, making
everything
> that can reasonably be so boolean, etc... GRanges are so much easier
to
> work with than data.frames when dealing with genomic coordinates)
>
> You'll need to supply your own grSet, but if you didn't have one, I
don't
> imagine you'd have asked :-)
> I have provided the dimensions (rows x columns) of the filtered
grSets for
> comparison in each case. You will have to decide what degree of
> conservatism is most appropriate for your experiment. Read the
docs!
>
>
> 1) Use the ProbeVariants package alliuded to above. This is
> comprehensive, but a bit of a PITA.
>
> require(minfi)
> grSet
> ## class: GenomicRatioSet
> ## dim: 485512 34
>
> require(Illumina450ProbeVariants.db)
> ?probe.450K.VCs.af ## read the docs!
> dataprobe.450K.VCs.af)
>
> commonSnpInProbe <- rownamesprobe.450K.VCs.af)[
probe.450K.VCs.af$probe50VC.com1pop
> > 0 ]
> grSet.noCommonSNPsInProbes <- grSet[
> setdiff(rownames(grSet), commonSnpInProbe), ]
> grSet.noCommonSNPsInProbes
> ## class: GenomicRatioSet
> ## dim: 308901 34
>
> commonVariantsEitherStrand <-
!is.naprobe.450K.VCs.af$diVC.com1pop.F)|!
> is.naprobe.450K.VCs.af$diVC.com1pop.R)
> CpGsWithCommonVariants <- rownamesprobe.450K.VCs.af)[
commonVariantsEitherStrand
> ]
> grSet.noCommonVariantsAtCpGs <- grSet[
> setdiff(rownames(grSet),CpGsWithCommonVariants), ]
> grSet.noCommonVariantsAtCpGs
> ## class: GenomicRatioSet
> ## dim: 445502 34
>
> Consult the documentation for finer control over the process (e.g.
> specific populations, etc.).
>
> ?probe.450K.VCs.af ## read the docs again
>
> It would not break my heart if the maintainers turned the data.frame
here
> into a GRanges; a few small changes followed by data.frame2GRanges()
would
> do the trick. But still, it's a very handy compilation.
> See below for the reason I prefer GRanges (in short, because I'm
impatient
> and don't like debugging).
>
>
> 2) Use the FDb package (> 1% MAF across all populations, dbSNP build
135).
> This is a one-liner.
>
> require(minfi)
> grSet
> ## class: GenomicRatioSet
> ## dim: 485512 34
>
> require(FDb.UCSC.snp135common.hg19)
> commonSNPs <- features(FDb.UCSC.snp135common.hg19)
>
> ## With a GRanges object, the previous rigamarole becomes a one-
liner:
> grSet.noCommonSNPsAtCpGs <- grSet[ grSet %outside% commonSNPs, ]
>
> grSet.noCommonSNPsAtCpGs
> ## class: GenomicRatioSet
> ## dim: 478385 34
>
> This was my workaround from 2-3 years ago to permit masking of TCGA
Level
> 3 methylation data. It still works and it still works well, but
it's been
> superseded (IMHO) by more flexible approaches like #1.
>
>
> 3) use SNPlocs (all SNPs in dbSNP; mildly annoying complication with
'ch'
> vs. 'chr' seqlevels)
>
> require(minfi)
> grSet
> ## class: GenomicRatioSet
> ## dim: 485512 34
>
> ## work around the annoyance:
> chroms <- seqlevels(grSet)
> names(chroms) <- chroms
> chroms <- gsub('chr', 'ch', chroms)
> require(SNPlocs.Hsapiens.dbSNP.20120608)
> SNPs.byChr <- GRangesList(lapply(chroms, getSNPlocs,
as.GRanges=TRUE))
> ## time passes...
> seqlevels(SNPs.byChr) <- gsub('ch', 'chr', seqlevels(SNPs.byChr)) ##
back
> to normal
> genome(SNPs.byChr) <- 'hg19' ## GRCh37.p5 coordinates are identical
to
> hg19 save for chrMT
>
> ## once the above hoops have been jumped through, it's back to one-
liners:
> grSet.noSNPsAtCpGs <- grSet[ grSet %outside% SNPs.byChr, ]
> grSet.noSNPsAtCpGs
> ## class: GenomicRatioSet
> ## dim: 444722 34
>
> This used to be documented in the minfi code/manual somewhere,
though I
> don't know if it still is.
>
>
> Statistics is the grammar of science.
> Karl Pearson <http: en.wikipedia.org="" wiki="" the_grammar_of_science="">
>
>
> On Fri, Jan 31, 2014 at 1:06 PM, C T <off... at="" gmail.com="" <javascript:="">>wrote:
>
>> Any tutorial on how to remove probes that contains SNPs?
>>
>> On Tuesday, November 12, 2013 7:12:46 PM UTC-5, Kasper Hansen
wrote:
>> >
>> > As part of Bioconductor 2.13, we have released minfi 1.8.x. Due
to a
>> > number of last minute errors, the recommended version is 1.8.3
(or
>> bigger).
>> >
>> > Users may find that their old objects cannot be linked to
annotation.
>> > Please run
>> > OBJECT = updateObject(OBJECT)
>> > to fix this.
>> >
>> > Highlights include
>> > * preprocessingQuantile(): an independent implementation of the
same
>> ideas
>> > as in Tost et al.
>> > * bumphunter() for finding DMRs
>> > * blockFinder() for finding large hypo-methylated blocks on the
450k
>> array.
>> > * estimateCellCounts() for estimating cell type composition for
whole
>> > blood samples. The function can be extended to work on other
types of
>> > cells, provided suitable flow sorted data is available.
>> > * the annotation now includes SNP annotation for dbSNP v132, 135
and
>> 137,
>> > independently annotated at JHU.
>> > * getSex(): you can now get sex repeatedly, irrespective of
relationship
>> > status.
>> > * minfiQC: find and remove outlier samples based on a sample QC
criteria
>> > we have found effective.
>> >
>> > Unfortunately, none of these handy changes are yet detailed in
the
>> > vignette; we are working on this.
>> >
>> > A manuscript is in review detailing most of these functions.
>> >
>> > Full NEWS below
>> >
>> > Best,
>> > Kasper D Hansen
>> >
>> > o Added getMethSignal(), a convenience function for
programming.
>> >
>> > o Changed the argument name of "type" to "what" for
>> getMethSignal().
>> >
>> > o Added the class "RatioSet", like "GenomicRatioSet" but
without
>> the
>> > genome information.
>> >
>> > o Bugfixes to the "GenomicRatioSet()" constructor.
>> >
>> > o Added the method ratioConvert(), for converting a
"MethylSet"
>> to a
>> > "RatioSet" or a "GenomicMethylSet" to a
"GenomicRatioSet".
>> >
>> > o Fixed an issue with GenomicMethylSet() and
GenomicRatioSet()
>> caused
>> > by a recent change to a non-exported function in the
>> GenomicRanges
>> > package (Reported by Gustavo Fernandez Bayon <gba... at="" gmail.com="">> <javascript:>
>> > >).
>> >
>> > o Added fixMethOutliers for thresholding extreme
observations in
>> the
>> > [un]methylation channels.
>> >
>> > o Added getSex, addSex, plotSex for estimating sex of the
samples.
>> >
>> > o Added getQC, addQC, plotQC for a very simple quality
control
>> > measure.
>> >
>> > o Added minfiQC for a one-stop function for quality control
>> measures.
>> >
>> > o Changed some verbose=TRUE output in various functions.
>> >
>> > o Added preprocessQuantile.
>> >
>> > o Added bumphunter method for "GenomicRatioSet".
>> >
>> > o Handling signed zero in minfi:::.digestMatrix which
caused unit
>> > tests to fail on Windows.
>> >
>> > o addSex and addQC lead to sampleNames() being dropped
because of
>> a
>> > likely bug in cbind(DataFrame, DataFrame). Work-around
has been
>> > implemented.
>> >
>> > o Re-ran the test data generator.
>> >
>> > o Fixed some Depends and Imports issues revealed by new
features
>> of R
>> > CMD check.
>> >
>> > o Added blockFinder and cpgCollapse.
>> >
>> > o (internal) added convenience functions for argument
checking.
>> >
>> > o Exposed and re-wrote getAnnotation().
>> >
>> > o Changed getLocations() from being a method to a simple
function.
>> > Arguments have been removed (for example, now the
function
>> always
>> > drops non-mapping loci).
>> >
>> > o Implemented getIslandStatus(), getProbeType(),
getSnpInfo() and
>> > addSnpInfo(). The two later functions retrieve pre-
computed SNP
>> > overlaps, and the new annotation object includes SNPs
based on
>> > dbSNP 137, 135 and 132.
>> >
>> > o Changed the IlluminaMethylatioAnnotation class to now
include
>> > genomeBuild information as well as defaults.
>> >
>> > o Added estimateCellCounts for deconvolution of cell types
in
>> whole
>> > blood. Thanks to Andrew Jaffe and Andres Houseman.
>> >
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Biocon... at r-project.org <javascript:>
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>>
>