Hi all,
I have a large RRBS dataset that I'm trying to analyze using
https://f1000research.com/articles/6-2055/v1
Following this paper, after filtering the methylated and unmethylated count matrix using the filter
keep <- rowSums(counts_total >= 10) == 100
my DGEList object is about 6e5 rows x 100 columns. If I further filter to promoters, I can carry out the differential methylation analysis in predefined gene promoters as outlined in the paper without issue. Without that promoter filtering, however, I cannot estimate dispersions and carry out the DMR analysis, because estimateDisp
takes too long to finish (for me, it never completes). For clarity, the command is
dge_methyl <- estimateDisp(dge_methyl, design=fullmethyldesign, trend="none", robust=TRUE)
The design matrix is created using the function modelMatrixMeth.
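For reference, here is a minimal sketch of how such a design is typically built with edgeR. The 50-sample count is inferred from the 100 columns (one methylated and one unmethylated column per sample), and the two groups "A" and "B" are hypothetical placeholders, not the actual experimental layout.

```r
library(edgeR)

# Assumption: 50 samples (methylated + unmethylated columns = 100 in total),
# split into two hypothetical groups "A" and "B".
n_samples <- 50
group <- factor(rep(c("A", "B"), each = n_samples / 2))

# Sample-level design: one column per group.
design_samples <- model.matrix(~ 0 + group)

# Expand to the methylation design: two rows per sample, and one extra
# column per sample in addition to the group columns.
fullmethyldesign <- modelMatrixMeth(design_samples)
dim(fullmethyldesign)
```

Note that the expanded design gains one column per sample, so with many samples it is much wider than the sample-level design.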
Is there a way to speed up this step and the subsequent steps using glmFit and glmLRT?
Thanks,
zo
How many different conditions do you have? How many columns does the design matrix have?