Hello everyone,
I'm trying to analyze DNA methylation data with the limma package to identify differentially methylated CpGs between my conditions using M-values. I would like to include the proportions of the different blood cell fractions in my samples, age, sex and possible batch effects as covariates in my linear model.
This is the code I'm using, followed by the error message I get from lmFit:
```
head(design)
  fDisease fControl Age SexM   CD8T   CD4T     NK  Bcell   Mono    Neu
1        1        0  53    0 0.0162 0.1618 0.0484 0.0435 0.0853 0.6471
2        0        1  53    0 0.0305 0.1034 0.0430 0.0409 0.1516 0.6391
3        1        0  37    1 0.0814 0.1255 0.0761 0.0718 0.0970 0.5559
4        0        1  36    1 0.0114 0.1454 0.0457 0.0499 0.1328 0.6277
5        1        0  37    1 0.0912 0.1006 0.0620 0.0948 0.0636 0.5912
6        0        1  36    1 0.0703 0.1295 0.0474 0.0359 0.1037 0.6249

fit <- lmFit(mvalues, design)
Coefficients not estimable: NK Bcell Mono Neu
Warning message:
Partial NA coefficients for 869576 probe(s)
```
I don't understand why some of my cell fractions are problematic in my model and not others.
I'm also wondering how to include technical batch effects in this model, in particular those linked to the use of different arrays.
Thank you in advance for any suggestions and help.
Hortense
I think adding `, myLoad$pd` in `model.matrix()` is misleading/incorrect. Remove it and try again. Try a simple model such as `~0 + f + NK` to test whether the same error occurs.
Hello Sam,
Thank you for your feedback! I have tried your suggestions by removing myLoad$pd and the cell populations that cause problems: NK, Bcell, Mono and Neu.
While this allows me to use the lmFit function without an error message, it causes problems in the rest of the code when I try to apply eBayes:
The strange thing is that I had no problems when I used these lines of code for another data set of the same nature...
I said that `, myLoad$pd` was incorrect because the variables you transformed are not assigned back into `myLoad$pd`. It should be `myLoad$pd$Neu <- as.numeric(myLoad$pd$Neu)`. The design shows that those numeric variables are not treated as factors, which is OK.
What are `dim(design)` and `dim(mvalues)`?
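To make the dimension question concrete, here is a minimal base-R sketch that compares the number of rows, the number of columns and the rank of a design matrix. It uses only the six rows printed by `head(design)` above, so it says nothing definite about the full design. The names `design6` and `mvalues_demo` are made up for illustration, and `mvalues_demo` is random stand-in data, not the real M-values.

```r
# First six rows of the design, copied from the head(design) output above.
# This is NOT the full design, only the part that was posted.
design6 <- cbind(
  fDisease = c(1, 0, 1, 0, 1, 0),
  fControl = c(0, 1, 0, 1, 0, 1),
  Age      = c(53, 53, 37, 36, 37, 36),
  SexM     = c(0, 0, 1, 1, 1, 1),
  CD8T     = c(0.0162, 0.0305, 0.0814, 0.0114, 0.0912, 0.0703),
  CD4T     = c(0.1618, 0.1034, 0.1255, 0.1454, 0.1006, 0.1295),
  NK       = c(0.0484, 0.0430, 0.0761, 0.0457, 0.0620, 0.0474),
  Bcell    = c(0.0435, 0.0409, 0.0718, 0.0499, 0.0948, 0.0359),
  Mono     = c(0.0853, 0.1516, 0.0970, 0.1328, 0.0636, 0.1037),
  Neu      = c(0.6471, 0.6391, 0.5559, 0.6277, 0.5912, 0.6249)
)

# Rows are samples, columns are coefficients to estimate.
print(dim(design6))

# The rank is the number of coefficients that can actually be estimated.
qr_design   <- qr(design6)
design_rank <- qr_design$rank

# Columns beyond the rank are the ones that cannot be estimated.
not_estimable <- colnames(design6)[qr_design$pivot[-seq_len(design_rank)]]
n_dropped     <- ncol(design6) - design_rank

# Residual degrees of freedom left for estimating the variance.
resid_df <- nrow(design6) - design_rank

print(design_rank)
print(not_estimable)
print(resid_df)

# Hypothetical stand-in for the M-value matrix: probes in rows, samples in
# columns. Its number of columns must equal the number of rows of the design.
set.seed(1)
mvalues_demo <- matrix(rnorm(5 * nrow(design6)), nrow = 5)
stopifnot(ncol(mvalues_demo) == nrow(design6))
```

With only these six rows the rank cannot exceed six, so four of the ten columns cannot be estimated and no residual degrees of freedom remain. Whether this is what happens in the real analysis depends on the actual `dim(design)` and `dim(mvalues)`. limma also provides `is.fullrank()` and `nonEstimable()` for the same kind of check on the full design.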