Hello,
I am using the dmrseq R package to call differentially methylated regions (DMRs) in the exons of ~8,000 selected genes (starting from Bismark output files).
I have seen that the default parameters in dmrseq are tuned for WGBS (data across the whole genome).
Since I am only interested in DMRs within these exons, would you recommend adjusting any dmrseq parameters? Or should I run dmrseq on the whole-genome data and subset to my regions afterwards?
I would prefer to run the analysis only for my regions of interest in order to reduce computational cost, but I am not sure how this could impact my results.
Many thanks in advance!
Hi,
Thanks for your question. If you only wish to identify DMRs within the exons of a subset of genes, I would advise altering some of the default parameters related to smoothing and to the maximum spacing allowed between CpGs in a DMR, so that your regions of interest aren't grouped together in either step. Specifically, I'd set maxGapSmooth and maxGap to smaller values (the first controls the max spacing between CpGs that are smoothed together, and the second controls the max spacing between CpGs that will be grouped in the same DMR): values that are larger than the typical CpG spacing within your regions of interest, but smaller than the typical spacing between your regions of interest.
Hope that helps!
Best,
Keegan
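One possible way to find a value in that window is to measure the two spacings directly from the CpG coordinates. The Python sketch below is only an illustration of the arithmetic (the same calculation is straightforward in R): the toy positions, the exon names, and the function names are all hypothetical and are not part of dmrseq. It assumes a single chromosome and uses the largest within-exon gap and the smallest between-exon gap as conservative stand-ins for the "typical" spacings mentioned above.

```python
# Hypothetical toy data: CpG positions (bp) on one chromosome, grouped by exon.
exon_cpgs = {
    "exon_A": [1000, 1012, 1030, 1075, 1110],
    "exon_B": [5200, 5215, 5260, 5290],
    "exon_C": [12000, 12040, 12055, 12100],
}


def within_gaps(regions):
    """Distances between consecutive CpGs inside each region."""
    gaps = []
    for positions in regions.values():
        pos = sorted(positions)
        gaps.extend(b - a for a, b in zip(pos, pos[1:]))
    return gaps


def between_gaps(regions):
    """Distance from the last CpG of one region to the first CpG of the next."""
    spans = sorted((min(p), max(p)) for p in regions.values())
    return [nxt[0] - cur[1] for cur, nxt in zip(spans, spans[1:])]


def suggest_gap_bounds(regions):
    """Return (lower, upper): a gap setting strictly between the two keeps
    CpGs of one region together while keeping separate regions apart."""
    lower = max(within_gaps(regions))   # largest spacing inside any region
    upper = min(between_gaps(regions))  # smallest spacing between regions
    if lower >= upper:
        raise ValueError("within- and between-region spacings overlap")
    return lower, upper


lower, upper = suggest_gap_bounds(exon_cpgs)
print(f"choose a gap value greater than {lower} bp and less than {upper} bp")
```

If the two ranges overlap in real data (for example, neighbouring exons separated by a short intron), no single value separates every pair of regions, and comparing medians or quantiles of the two distributions may be a more practical guide than the extremes used here.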