bsseq and BSmooth with RRBS

Alex Gutteridge ▴ 650

I'm looking over the documentation for bsseq:

http://www.bioconductor.org/packages/2.11/bioc/html/bsseq.html

and the associated Genome Biology paper:

http://genomebiology.com/content/pdf/gb-2012-13-10-r83.pdf

From what I can see, the method should work on reduced representation bisulfite sequencing (RRBSeq) datasets as well as WGBSeq, though the documentation only mentions WGBSeq. The only issue I can see is the smoothing procedure, which I thought might struggle in regions where the CpGs covered by RRBSeq are scattered.

I will have data in hand shortly to test this myself, but I was curious whether anyone on the list has run bsseq (successfully or otherwise) on RRBS data before.

--
Alex Gutteridge

Sequencing bsseq

Updated 12.1 years ago by Kasper Daniel Hansen ★ 6.5k • written 12.1 years ago by Alex Gutteridge ▴ 650

Kasper Daniel Hansen ★ 6.5k

On Thu, Feb 21, 2013 at 11:31 AM, Alex Gutteridge <alexg at ruggedtextile.com> wrote:
> I'm looking over the documentation for bsseq:
>
> http://www.bioconductor.org/packages/2.11/bioc/html/bsseq.html
>
> And the associated Genome Biology paper:
>
> http://genomebiology.com/content/pdf/gb-2012-13-10-r83.pdf
>
> From what I can see the method should work on reduced representation
> bisulfite sequencing (RRBSeq) datasets as well as WGBSeq, though the
> documentation only mentions WGBSeq. The only issue I can see being the
> smoothing procedure which I thought might struggle in regions with scattered
> CpGs in the RRBSeq.
>
> I will have data in hand shortly to test this myself, but was just curious
> if anyone on the list had run bsseq (successfully or otherwise) on RRBS data
> before?

I don't have any hands-on experience with RRBS, but I agree with your assessment that in principle it should work, depending on how large the regions captured in your experiment actually are. Mathematically, it should run out of the box. Pay close attention to the maxGap parameter, which sets the maximum distance between two neighboring CpGs beyond which smoothing is not carried across the gap.

I have had inquiries like yours about using it for RRBS data, but I never heard back about how it went (and as I recall, that was much earlier in the development phase, so the software ought to work much better now).

I would be happy to hear about your experience.

Note that currently (some of) the statistics are really meant for the case where you have biological replicates, but that question is orthogonal to the choice of assay.

Best,
Kasper
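The effect of a gap threshold such as maxGap can be illustrated in base R. This is a sketch of the idea only, not bsseq's implementation: the CpG positions, the 1000 bp threshold, and the names `pos`, `max_gap`, `cluster_id`, `clusters`, and `sizes` are all made up for illustration.

```r
# Hypothetical CpG positions (bp) on one chromosome, RRBS-like:
# a few dense stretches separated by long uncovered gaps
pos <- c(100, 120, 150, 180, 5000, 5030, 5060, 20000, 20010)

# Illustrative threshold only, not a recommended value
max_gap <- 1000

# A new cluster starts wherever the distance to the previous CpG exceeds max_gap
cluster_id <- cumsum(c(TRUE, diff(pos) > max_gap))

# CpGs grouped by cluster; smoothing would be done within each group separately
clusters <- split(pos, cluster_id)
sizes <- lengths(clusters)

print(cluster_id)  # 1 1 1 1 2 2 2 3 3
print(sizes)       # 4, 3 and 2 CpGs per cluster
```

In bsseq itself the threshold is passed as the `maxGap` argument of `BSmooth()`, alongside the window parameters `ns` and `h`. With RRBS, small clusters like the last one above are the ones to watch, since few CpGs are available to smooth over.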

12.1 years ago • Kasper Daniel Hansen ★ 6.5k

On 21.02.2013 17:52, Kasper Daniel Hansen wrote:
> On Thu, Feb 21, 2013 at 11:31 AM, Alex Gutteridge <alexg at ruggedtextile.com> wrote:
>> I'm looking over the documentation for bsseq:
>>
>> http://www.bioconductor.org/packages/2.11/bioc/html/bsseq.html
>>
>> And the associated Genome Biology paper:
>>
>> http://genomebiology.com/content/pdf/gb-2012-13-10-r83.pdf
>>
>> From what I can see the method should work on reduced representation
>> bisulfite sequencing (RRBSeq) datasets as well as WGBSeq, though the
>> documentation only mentions WGBSeq. The only issue I can see being the
>> smoothing procedure which I thought might struggle in regions with
>> scattered CpGs in the RRBSeq.
>>
>> I will have data in hand shortly to test this myself, but was just
>> curious if anyone on the list had run bsseq (successfully or otherwise)
>> on RRBS data before?
>
> I don't have any hands-on experience with RRBS, but I agree with your
> assessment that in principle it should work, depending on how large
> regions are actually captured in your experiment. Mathematically, it
> should run out of the box. Pay close attention to the maxGap
> parameter which sets a limit for how far away two neighboring CpGs
> can be before not smoothing.
>
> I have had inquiries like yours about using it for RRBS data, but I
> never heard back about their experience (and as I recall it, it was
> much earlier in the development phase, so the software ought to work
> much better now).
>
> I would be happy to hear your experience.
>
> Note that currently, (some of) the statistics are really meant for
> the case where you have biological replicates, but that question is
> really orthogonal to the choice of assay.
>
> Best,
> Kasper

Thanks Kasper,

I will give it a go and report back.

I was also looking at modelling this on an individual-locus basis with a binomial GLM, i.e., with made-up data:

    > M
    [1] 10 10 10 30 40 10
    > U
    [1] 10 20 15  3 10  0
    > y = cbind(M,U)
    > group = factor(c(0,0,0,1,1,1))
    > model = glm(y~group,binomial)
    > summary(model)

    Call:
    glm(formula = y ~ group, family = binomial)

    Deviance Residuals:
          1        2        3        4        5        6
     0.9036  -0.7537   0.0000   0.8569  -1.1656   1.7354

    Coefficients:
                Estimate Std. Error z value Pr(>|z|)
    (Intercept)  -0.4055     0.2357  -1.720   0.0854 .
    group1        2.2225     0.3808   5.837 5.31e-09 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    (Dispersion parameter for binomial family taken to be 1)

        Null deviance: 46.8213  on 5  degrees of freedom
    Residual deviance:  6.4888  on 4  degrees of freedom
    AIC: 28.198

    Number of Fisher Scoring iterations: 4

Here M is the number of methylated reads at a single locus across 6 samples, and U is the number of unmethylated reads at the same locus in the same samples. group is a factor giving the design of the experiment (here, two groups, each with n = 3).

Is the conclusion from your work that pre-smoothing the data gives better performance for detecting DMRs than this naive approach? If we were interested in testing specific CpG loci (as opposed to broader regions), would this approach make sense?

--
Alex Gutteridge
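To make the coefficients in the summary above easier to read: with a single two-level factor, the fitted binomial GLM reproduces the pooled methylation proportion in each group. The following is a minimal base R sketch using the same made-up counts; the names `fitted_prop` and `pooled_prop` are introduced here for illustration only.

```r
# Made-up read counts from the post: methylated (M) and unmethylated (U)
M <- c(10, 10, 10, 30, 40, 10)
U <- c(10, 20, 15, 3, 10, 0)
group <- factor(c(0, 0, 0, 1, 1, 1))

# Same model as in the post: two-column response of (successes, failures)
model <- glm(cbind(M, U) ~ group, family = binomial)

# Back-transform the logit-scale coefficients to methylation proportions:
# intercept = group 0, intercept + group1 = group 1
fitted_prop <- plogis(cumsum(coef(model)))

# Pooled proportion of methylated reads per group, computed directly
pooled_prop <- tapply(M, group, sum) / tapply(M + U, group, sum)

print(round(unname(fitted_prop), 3))  # about 0.40 and 0.86
print(round(pooled_prop, 3))          # same values: 30/75 and 80/93
```

Note that the binomial family fixes the dispersion at 1, as the summary output states, so variation between replicates beyond binomial sampling of reads is not accounted for in the z test on `group1`.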

12.1 years ago • Alex Gutteridge ▴ 650
