Hello, I have a set of patient tumors, some with and some without a particular condition. I am using the sesame package to investigate their methylation patterns. So far it has been really helpful, but there is one thing I don't quite understand and another that I don't know how to do.
I would like to compare methylation at the gene level between samples with and without the condition. For this I run a differential methylation analysis, for example:
library(sesame)
library(SummarizedExperiment)
se <- SummarizedExperiment(betas, colData = metaData)
smry <- DML(se, ~condition, BPPARAM = BiocParallel::MulticoreParam(4))
test_result <- summaryExtractTest(smry)
Now I would like to look at what happened at the gene level. One way is to do something like this:
df <- test_result[test_result$Pval_Condition < 0.01 & abs(test_result$Est_Condition) > 0.1, ]
dbs <- KYCG_buildGeneDBs(df$Probe_ID, max_distance = 100000, platform = "EPIC")
result <- testEnrichment(df$Probe_ID, dbs, platform = "EPIC")
But then I get something like this:
> estimate          3.284153
> p.value           7.438132e-16
> log10.p.value     -15.12854
> test              Log2(OR)
> nQ                5618
> nD                403
> overlap           24
> cf_Jaccard        0.004002001
> cf_overlap        0.05955335
> cf_NPMI           0.2113224
> cf_SorensenDice   0.007972098
> FDR               7.138797e-12
> group             KYCG.EPIC.gene.00000000
> dbname            ENSG00000278341.1
> gene_name         AC138028.6
My first question is: what are "estimate" and "overlap"? Is the former something like an effect size? Is the latter the number of probes? I can't find this documented anywhere.
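For what it's worth, I tried to reverse-engineer the columns from the row above. The sketch below assumes the textbook definitions of the Jaccard index, the overlap coefficient and the Sørensen-Dice coefficient (my assumption, not something I found in the sesame documentation), and treats nQ as the query set size, nD as the database set size and overlap as the number of shared probes. Under those assumptions the cf_ values are reproduced, which would suggest that "overlap" is a probe count:

```r
# Values copied from the single result row shown above.
nQ      <- 5618   # assumed: number of probes in my query
nD      <- 403    # assumed: number of probes in the gene database set
overlap <- 24     # assumed: number of probes present in both sets
p_value <- 7.438132e-16

# Textbook set-similarity definitions (an assumption about what sesame computes).
jaccard       <- overlap / (nQ + nD - overlap)
overlap_coef  <- overlap / min(nQ, nD)
sorensen_dice <- 2 * overlap / (nQ + nD)

print(signif(jaccard, 7))        # 0.004002001, matches cf_Jaccard
print(signif(overlap_coef, 7))   # 0.05955335,  matches cf_overlap
print(signif(sorensen_dice, 7))  # 0.007972098, matches cf_SorensenDice
print(signif(log10(p_value), 7)) # -15.12854,   matches log10.p.value
```

This does not tell me how "estimate" is computed. The "test" column says Log2(OR), so I presume it is a log2 odds ratio for enrichment, but I could not reproduce it from these three numbers alone.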
My second question: I would like to make a heatmap of some of these genes and color them by something akin to their difference in methylation between conditions (as one would do with fold changes in a differential expression analysis, for example). Would it be valid to use the "estimate" above for this?
Thank you very much for this! Makes sense. Still so much to learn about this.