Dear all,
I have 450k methylation data from 120 samples. I perform differential methylation analysis (using the m-values) for a interval scaled variable (current_phenotype with levels of 1,2,3) with limma which identifies 1336 hypo- and 635 hypermethylated CpG´s.The significant CpG´s (FDR<0.05 & logFC > 2) are then used for an analysis of differentially methylated pathways using the missMethyl package (gometh function). Although I get several differential methylated pathways I have troubles with interpreting the results. Specifically, how do I know whether higher values of "current_phenotpye" are associated with increased or decreased pathway methylation? Moreover, does it make sense to subject both significant hyper- and hypermethylated CpGs to the pathway analysis (since the gometh function does only know which CpG is significant, however not the direction i.e. whether it is hypo- or hypermethylated)?
Thanks for your help,
Philipp
library(limma) library(missMethyl) design2 = model.matrix(~current_phenotype) fit2 = lmFit(m_values, design2) keep <- fit2$Amean > median(fit2$Amean) fitEb <- eBayes(fit2[keep,], robust=T, trend=T) summary(decideTests(fitEb)) (Intercept) current_phenotype -1 812 1336 0 20713 204111 1 184557 635 tt <- topTable(fitEb,coef=2,sort.by="p", p.value=0.05, lfc=2, adjust.method="BH",number=Inf) gst.KEGG <- gometh(sig.cpg=rownames(tt), all.cpg=rownames(m_values), collection="KEGG", prior.prob = T) gst.KEGG <- gst.KEGG[order(gst.KEGG$FDR),] gst.KEGG <- gst.KEGG[gst.KEGG$FDR<0.05,] head(gst.KEGG)
Dear Philipp,
Could you please use the "ADD COMMENT" button if you want to respond to my answer, rather than posting further questions as an Answer? Otherwise the whole dialog will get messed up on the webpage.
The KEGG pathways are not perfectly directional. A gene will be included in a KEGG pathway if it is functionally involved in the operation of the pathway. However this does not necessarily mean that the gene must be up regulated when the pathway is in operation. In some cases, the gene could be down-regulated.
Even if a pathway is up-regulated in your experiment, some of the genes positively associated with the pathway could be down-regulated in your experiment because of cross-talk with other pathways.
When you judge whether a pathways is up or down regulated, you should only be taking notice of very small p-values, preferably less than 1e-6. Don't take any notice of p-values above 0.001 say.
The eBayes() analysis will be valid whether or not you set robust=TRUE, although that option will often increase power, see: http://arxiv.org/abs/1602.08678 . However you must not use trend=TRUE unless your data contains information about absolute intensity.