bsseq and BSmooth with RRBS
@alex-gutteridge-2935
Last seen 10.2 years ago
United States
I'm looking over the documentation for bsseq:

http://www.bioconductor.org/packages/2.11/bioc/html/bsseq.html

and the associated Genome Biology paper:

http://genomebiology.com/content/pdf/gb-2012-13-10-r83.pdf

From what I can see, the method should work on reduced representation bisulfite sequencing (RRBSeq) datasets as well as WGBSeq, though the documentation only mentions WGBSeq. The only issue I can see is the smoothing procedure, which I thought might struggle in regions with scattered CpGs in the RRBSeq data.

I will have data in hand shortly to test this myself, but was just curious whether anyone on the list has run bsseq (successfully or otherwise) on RRBS data before?

--
Alex Gutteridge
@kasper-daniel-hansen-2979
Last seen 17 months ago
United States
On Thu, Feb 21, 2013 at 11:31 AM, Alex Gutteridge <alexg at ruggedtextile.com> wrote:
> [...] was just curious if anyone on the list had run bsseq (successfully
> or otherwise) on RRBS data before?

I don't have any hands-on experience with RRBS, but I agree with your assessment that in principle it should work, depending on how large the regions captured in your experiment actually are. Mathematically, it should run out of the box. Pay close attention to the maxGap parameter, which sets a limit on how far apart two neighboring CpGs can be before smoothing is no longer carried out across the gap between them.

I have had inquiries like yours about using it for RRBS data, but I never heard back about how it went (and as I recall, that was much earlier in the development phase, so the software ought to work much better now).

I would be happy to hear about your experience.

Note that currently (some of) the statistics are really meant for the case where you have biological replicates, but that question is really orthogonal to the choice of assay.

Best,
Kasper
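To make the maxGap idea concrete, here is a minimal base-R sketch of how a gap threshold splits CpG positions into separate stretches. It only illustrates the concept described above and is not bsseq's internal code; the positions, the name `max_gap`, and the threshold value are all made up for the example.

```r
# Hypothetical CpG positions (bp) on one chromosome: two dense RRBS-like
# fragments separated by a long uncovered stretch.
pos <- c(100, 130, 180, 260, 50000, 50040, 50110)

# Illustrative threshold only, not a recommended setting.
max_gap <- 1000

# Start a new cluster whenever the distance to the previous CpG exceeds max_gap.
cluster_id <- cumsum(c(TRUE, diff(pos) > max_gap))

# CpGs grouped by cluster; under a gap rule like this, smoothing would be
# done within each group rather than across the large gap.
clusters <- split(pos, cluster_id)
print(clusters)
```

With RRBS-like coverage, most of the genome falls into gaps of this kind, so the number and size of the resulting clusters is worth checking before smoothing.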
On 21.02.2013 17:52, Kasper Daniel Hansen wrote:
> [...]
> I would be happy to hear your experience.
> [...]

Thanks Kasper, I will give it a go and report back.

I was also looking at modelling this on an individual locus basis with a binomial glm. I.e., with made-up data:

    > M
    [1] 10 10 10 30 40 10
    > U
    [1] 10 20 15  3 10  0
    > y = cbind(M,U)
    > group = factor(c(0,0,0,1,1,1))
    > model = glm(y~group,binomial)
    > summary(model)

    Call:
    glm(formula = y ~ group, family = binomial)

    Deviance Residuals:
          1        2        3        4        5        6
     0.9036  -0.7537   0.0000   0.8569  -1.1656   1.7354

    Coefficients:
                Estimate Std. Error z value Pr(>|z|)
    (Intercept)  -0.4055     0.2357  -1.720   0.0854 .
    group1        2.2225     0.3808   5.837 5.31e-09 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    (Dispersion parameter for binomial family taken to be 1)

        Null deviance: 46.8213  on 5  degrees of freedom
    Residual deviance:  6.4888  on 4  degrees of freedom
    AIC: 28.198

    Number of Fisher Scoring iterations: 4

Here M is the number of methylated reads at a single locus across 6 samples, and U is the number of unmethylated reads at the same locus in the same samples. group is a factor giving the design of the experiment (here two groups, both with n=3).

Is the conclusion from your work that pre-smoothing the data gives better performance than this naive approach for detecting DMRs? If we were interested in testing specific CpG loci (as opposed to broader regions), would this approach make sense?

--
Alex Gutteridge
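For readers who want to rerun the single-locus model, the sketch below refits it on the same made-up counts using base R only. The back-transformation with `plogis()` and the likelihood-ratio test via `anova()` are additions for illustration and are not part of the original post; the sketch says nothing about how this compares with smoothing.

```r
# Made-up counts from the post: methylated (M) and unmethylated (U) reads
# at one locus in six samples, three per group.
M <- c(10, 10, 10, 30, 40, 10)
U <- c(10, 20, 15, 3, 10, 0)
group <- factor(c(0, 0, 0, 1, 1, 1))

# Binomial GLM with a two-column (successes, failures) response.
fit <- glm(cbind(M, U) ~ group, family = binomial)

# Fitted methylation proportion per group, back on the probability scale.
p_hat <- c(group0 = plogis(coef(fit)[["(Intercept)"]]),
           group1 = plogis(sum(coef(fit))))

# For a one-factor model these match the pooled read proportions per group
# (30/75 and 80/93 for these counts).
pooled <- tapply(M, group, sum) / tapply(M + U, group, sum)

# Likelihood-ratio test for the group effect: the drop from the null
# deviance to the residual deviance on 1 degree of freedom.
lrt <- anova(fit, test = "Chisq")

print(p_hat)
print(pooled)
print(lrt)
```

Because a plain binomial fit pools reads within each group, it treats every read as independent. If samples within a group differ from each other, this can overstate significance; `family = quasibinomial` is one standard way to let the dispersion be estimated instead of fixed at 1.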