I have 1000 candidate genes with their expression values (FPKM). I would like to test for gene-expression differences between two groups of individuals: heart failure and normal. In the linear model I have some covariates, namely "age", "sex", "BMI" and "Hypertension"; however, there are some missing values (NA) in the covariate file. I created the design matrix as follows:
design <- model.matrix(~ age + sex + BMI + Hypertension + group)  # group: No = normal, Yes = heart failure
When I then ran the linear model with fit <- lmFit(expression, design), it gave me an error because of the missing values in the model.
All values in "expression" and "group" are complete (nothing missing); I only have missing values in the covariates. How can I fix this? Could you please advise, ideally with an example? Thank you.
Indeed, if you have NA values in the covariates you supply to model.matrix, the function will automatically remove the corresponding samples. This can result in your design matrix not being of full rank.
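Since you asked for an example, here is a minimal sketch of one common way around this: a complete-case analysis, where the samples with missing covariates are dropped from both the covariate table and the expression matrix before building the design. Dropping rows in model.matrix also means the design no longer has one row per column of the expression matrix, which I believe is what lmFit is actually complaining about. Everything in the data below is simulated and the names covars, expression_cc and keep are my own placeholders, not your objects. I am also assuming that your FPKM values should be log-transformed before going into limma (log2(FPKM + 1) here), so adjust that to whatever you already do.

```r
# Simulated stand-ins for the poster's data (all values are made up).
set.seed(1)
n <- 12
covars <- data.frame(
  age          = c(54, 61, NA, 47, 66, 58, 49, 70, 63, NA, 52, 59),
  sex          = factor(rep(c("F", "M"), 6)),
  BMI          = c(24.1, 27.3, 30.2, NA, 22.8, 26.5,
                   29.0, 31.4, 25.2, 23.9, 28.7, 27.0),
  Hypertension = factor(c("No", "Yes", "Yes", "No", "No", "Yes",
                          "No", "Yes", "Yes", "No", "Yes", "No")),
  group        = factor(rep(c("No", "Yes"), each = 6), levels = c("No", "Yes"))
)
rownames(covars) <- paste0("S", seq_len(n))

# Genes in rows, samples in columns, same sample order as covars.
expression <- matrix(rexp(5 * n), nrow = 5,
                     dimnames = list(paste0("gene", 1:5), rownames(covars)))

# Keep only the samples for which every covariate is observed.
keep <- complete.cases(covars)
covars_cc <- droplevels(covars[keep, ])
expression_cc <- expression[, keep, drop = FALSE]

# Build the design from the filtered table so no rows are silently dropped.
design <- model.matrix(~ age + sex + BMI + Hypertension + group,
                       data = covars_cc)

# Sanity checks: one design row per sample, and a full-rank design.
stopifnot(nrow(design) == ncol(expression_cc))
stopifnot(qr(design)$rank == ncol(design))

# Fit the model only if limma is installed.
if (requireNamespace("limma", quietly = TRUE)) {
  # Assumption: FPKM is put on the log2 scale before fitting.
  fit <- limma::lmFit(log2(expression_cc + 1), design)
  fit <- limma::eBayes(fit)
  print(limma::topTable(fit, coef = "groupYes"))
}
```

The price of this approach is that you lose every sample with at least one missing covariate. If that removes too many samples, the alternatives are to leave out the covariate with the most NAs or to impute the missing values, but both are judgment calls that depend on your data.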