Hi there,
I ran into a problem with DESeq2 after adding variables created with RUVSeq to remove unwanted variation from my dataset. I set up my DESeq dataset like this:
dds <- DESeqDataSetFromMatrix(countData = counts, colData = sample_info, design = ~Batch+Condition)
Batch is a factor with 3 levels (1,2,3), Condition is a factor with 4 levels (Ctrl, trt1, trt2, trt3). Each Condition has one sample per batch. In total there are 12 samples.
When I call DESeq(dds) on this, I don't get any error.
With RUVSeq, I then generated some variables and want to use the first 7 to take care of the unwanted variation. I added those 7 numeric variables (W_1 to W_7) to the colData slot and updated the design of the DESeq dataset:
design(dds) <- ~ W_1+W_2+W_3+W_4+W_5+W_6+W_7+Batch+Condition
When I now run dds <- DESeq(dds), an error is returned:
Error in designAndArgChecker(object, betaPrior) :
full model matrix is less than full rank
When I remove W_7 from the design, the following error is returned:
Error in checkForExperimentalReplicates(object, modelMatrix) :
The design matrix has the same number of samples and coefficients to fit,
so estimation of dispersion is not possible. Treating samples
as replicates was deprecated in v1.20 and no longer supported since v1.22.
When I remove both W_6 and W_7 from the design, DESeq() runs without an error.
Is there a maximum number of variables that can be added to the design, depending on the number of samples? Otherwise, how can these errors be explained?
Best, Anne
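You have 12 covariates, not 9. With 3 batches and 4 conditions, Batch and Condition contribute 2 and 3 covariates respectively; together with the seven surrogate variables, that adds up to 12. Counting the intercept, the model matrix has 13 columns for only 12 samples, so it cannot be full rank. Dropping W_7 leaves 12 columns for 12 samples, which leaves no residual degrees of freedom for estimating dispersion. You can see this by building the design matrix yourself (for example with model.matrix()) and inspecting it.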
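As a rough illustration of that check, here is a minimal sketch in base R. It does not use the poster's data: `sample_info` is rebuilt here as a hypothetical stand-in with the same layout (3 batches x 4 conditions, 12 samples), the W_1 to W_7 columns are random numbers standing in for the RUVSeq factors, and `count_columns` is a helper name invented for this example.

```r
# Hypothetical stand-in for the poster's sample_info:
# one sample per Batch/Condition combination, 12 samples in total.
set.seed(1)
sample_info <- expand.grid(
  Batch = factor(1:3),
  Condition = factor(c("Ctrl", "trt1", "trt2", "trt3"))
)

# Random placeholders for the RUVSeq factors W_1..W_7 (assumption:
# the real factors are numeric columns, as described in the question).
for (i in 1:7) {
  sample_info[[paste0("W_", i)]] <- rnorm(nrow(sample_info))
}

# Build the model matrix for a design and report its dimensions and rank.
count_columns <- function(design) {
  mm <- model.matrix(design, data = sample_info)
  c(samples = nrow(mm), coefficients = ncol(mm), rank = qr(mm)$rank)
}

full <- count_columns(
  ~ W_1 + W_2 + W_3 + W_4 + W_5 + W_6 + W_7 + Batch + Condition
)
minus_W7 <- count_columns(
  ~ W_1 + W_2 + W_3 + W_4 + W_5 + W_6 + Batch + Condition
)
minus_W6_W7 <- count_columns(
  ~ W_1 + W_2 + W_3 + W_4 + W_5 + Batch + Condition
)

# One row per design: compare coefficients against samples.
print(rbind(full, minus_W7, minus_W6_W7))
```

The `coefficients` column counts the intercept, the 2 Batch and 3 Condition indicator columns, and each W variable. A design is only usable when that count is strictly smaller than the number of samples, which matches the pattern of errors in the question.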
Ah okay, I understand. Thank you very much. I hadn't realized that Batch and Condition each count as more than one covariate.