Hello,
Just a simple question about the package MEDIPS. I have found this to be a great resource for MeDIP-seq; however, one point that is bugging me is that the counts present in the resulting file are not the TMM-normalized counts used for differential coverage testing (in the MEDIPS.meth function). Is anyone aware of how I can extract the normalized values?
Thanks!
Hello Lukas. Thank you for the reply. After coming back to this after a while and switching to quantile normalization, it seems that the counts are actually normalized by it and that this output is what appears in the table.
So I would like to just double check the following conclusions are correct:
When quantile normalization is selected:
Thanks so much for the checks!
Hi,
Thanks for your inquiry!
Yes, the ‘count’ columns in the result table contain the quantile normalized values. The warning is there to emphasize that the rpkm and rms values in the other columns are not quantile normalized.
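To make concrete what "quantile normalized" means for the count columns, here is a minimal Python sketch of generic quantile normalization. It is only an illustration of the technique, not the code MEDIPS runs; the function name `quantile_normalize` and the example counts are made up for this post, and ties are broken by position rather than averaged.

```python
def quantile_normalize(samples):
    """Generic quantile normalization (illustrative, not MEDIPS's own code).

    samples: list of per-sample count lists, all of the same length
             (one entry per window / row of the result table).
    Returns a list of per-sample lists in the same order.
    """
    n_windows = len(samples[0])
    n_samples = len(samples)

    # Reference distribution: mean of the k-th smallest value across samples.
    sorted_samples = [sorted(s) for s in samples]
    reference = [
        sum(s[k] for s in sorted_samples) / n_samples for k in range(n_windows)
    ]

    normalized = []
    for s in samples:
        # Window indices ordered by increasing count (ties keep input order).
        order = sorted(range(n_windows), key=lambda i: s[i])
        out = [0.0] * n_windows
        for rank, window in enumerate(order):
            # Each window gets the reference value of its rank in this sample.
            out[window] = reference[rank]
        normalized.append(out)
    return normalized, reference


# Hypothetical counts for three replicates over four windows.
example_samples = [
    [5, 2, 3, 4],
    [4, 1, 3, 2],
    [3, 4, 6, 8],
]
example_normalized, example_reference = quantile_normalize(example_samples)
```

After this step every replicate has the same distribution of values, and only the ranking of windows within a replicate differs, which is why the normalized counts are directly comparable across replicates in a plot.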
The quantile normalized counts are indeed what edgeR sees (in case you opted for that normalization). Whether it makes sense to visualize gene-level methylation values is another topic (do you mean average methylation values across a gene?).
All the best, Lukas
Hi Lukas, thank you for the quick reply!
Understood on the warning. Makes sense now.
Regarding visualization - simply put, yes, on a "per row" basis from the outputs of MEDIPS. In short, my workflow has specific windows of ROIs (e.g. promoters and gene bodies, which are later annotated to gene names), so for each row the methylation quantification is for these ranges.
The goal is to plot methylation values for each biological replicate from ranges of interest. For that we need normalized methylation (very similar to how one can do this with RNA-seq with DESeq2) to plot it accurately. At first, when looking at TMMs and using the counts, it was not making much sense which is why I originally asked the question.
But now with quantile, it seems a lot better and appropriate to use these values to represent normalized methylation across the windows we define (aka each row from MEDIPS table).
It's worth noting that the version of MEDIPS used was 1.34.0, but I don't think this has changed in the newest one (there are some value differences, but I believe the concept is the same as far as my main questions are concerned).
Hi Ishepard,
I am glad to hear that the quantile normalized values give you reasonable visualizations. The TMM normalization happens within edgeR and there will be a scaling factor that, I think, goes into the model. Quantile normalization is done by MEDIPS and the results are provided to edgeR. Therefore, I can write out the quantile normalized values in the result table.
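As a rough illustration of the scaling-factor idea for TMM: in edgeR the normalization factor multiplies the library size to give an effective library size, and counts per million are computed against that. The Python sketch below shows only that arithmetic. It assumes you already have the library sizes and TMM factors from edgeR; the function name `tmm_normalized_cpm` and all numbers are hypothetical, and nothing here is taken from the MEDIPS code.

```python
def tmm_normalized_cpm(counts, lib_sizes, norm_factors):
    """Counts per million using effective library sizes (illustrative).

    counts:       list of rows (windows), each a list with one count per sample
    lib_sizes:    total read count per sample
    norm_factors: TMM normalization factor per sample (assumed to come from edgeR)
    """
    # Effective library size = library size * normalization factor.
    effective = [size * factor for size, factor in zip(lib_sizes, norm_factors)]
    return [
        [count / eff * 1e6 for count, eff in zip(row, effective)]
        for row in counts
    ]


# Hypothetical input: two windows, two samples.
example_counts = [[10, 20], [30, 60]]
example_lib_sizes = [1_000_000, 2_000_000]
example_factors = [0.5, 2.0]

example_cpm = tmm_normalized_cpm(example_counts, example_lib_sizes, example_factors)
```

Values computed this way are on a per-million scale rather than raw counts, so they would not match the 'count' columns of the result table.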
I think your approach makes total sense. The question is always what counts as a reasonable region of interest. Does it make sense to calculate a mean methylation value across the entire gene body? Not sure, maybe, maybe not. It depends on your question. The only other thing is that the counts / quantile normalized counts are not normalized by CpG density. While this might be negligible when you do a differential analysis, you still might want to plot actual %methylation values. The rms values in MEDIPS were supposed to reflect that, but honestly I would strongly recommend the qsea package for transforming counts into %methylation values (if that's something you want to do).
All the best, Lukas