DESeq2 design error message with ordered factor but not numeric
chris • 0


I ran into this while setting up a design and thought it was interesting (and I can't figure out why it happens). We have an RNA-Seq experiment on which we'd like to do some basic dose-response analysis (i.e., does expression change with increasing dosage of a drug?). We currently use a numeric column, Dose, for this. The thought was to change it to an ordered factor, which might represent the data better (or not). I created a new column, DoseFactor, for this purpose; it contains the same information, just stored as an ordered factor instead of a numeric.

    Dose DoseFactor
   <dbl> <ord>     
 1  0.1  0.1       
 2  0    0         
 3  0.03 0.03      
 4  0.3  0.3       
 5  0    0         
 6  0.1  0.1       
 7  0.1  0.1       
 8  0.1  0.1       
 9  0.03 0.03      
10  0.3  0.3    
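
For reference, a column like DoseFactor could be built along these lines. This is a minimal sketch that assumes only the ten Dose values shown above; the data frame name `coldata` is hypothetical, and the real sample table may have been constructed differently.

```r
# Hypothetical sample table holding only the ten Dose values shown above
coldata <- data.frame(Dose = c(0.1, 0, 0.03, 0.3, 0, 0.1, 0.1, 0.1, 0.03, 0.3))

# Ordered factor with levels sorted by increasing dose
coldata$DoseFactor <- factor(coldata$Dose,
                             levels = sort(unique(coldata$Dose)),
                             ordered = TRUE)

levels(coldata$DoseFactor)          # level labels, lowest dose first
is.ordered(coldata$DoseFactor)      # confirms the factor is ordered
length(unique(coldata$DoseFactor))  # number of distinct doses present
```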

If I create the dataset using the numeric column, all is fine. However, with the factor column, I get this message:

DESeqDataSetFromMatrix(countData = countMatrix.1061,
                            colData = samplesubset.1061,
                            design = ~ DoseFactor)

Error in DESeqDataSet(se, design = design, ignoreRank) : 
  design contains one or more variables with all samples having the same value,
  remove these variables from the design

I would really appreciate anyone being able to shed light on this, or point me to the error in my thinking about this variable as a factor.



> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.4

Thinking out loud, I'm wondering if it might be simpler to just use a combined factor representing multiple conditions and then do contrasts with known information (group D is higher dosage than group A, for example).


The test you are seeing gives TRUE if all the values are equal to the first value in the factor. Something like all(DoseFactor == DoseFactor[1]). Can you think of a reason why this would return TRUE? Is that table you are showing the same as the one you use to create the dds?

Thanks, Michael. Yup, the table above is a subset of the data. Here are the actual values in DoseFactor used to create the dds. Definitely confusing.

> all(samplesubset.1061$DoseFactor==samplesubset.1061$DoseFactor[1])
> unique(samplesubset.1061$DoseFactor)
[1] 0.1  0    0.03 0.3 
Levels: 0 < 0.03 < 0.1 < 0.3

Still thinking about how to do dose-response...

Also interesting: it only fails with an ordered factor. Using Dose as an unordered factor works. That's not really what I want, but it doesn't error out.

I remember now, we don't have formula support for ordered factors, but you can just supply a matrix to the design argument instead of a formula. You would build the matrix using model.matrix().
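
A minimal sketch of the matrix approach, using base R only. The data frame `coldata` is a hypothetical stand-in built from the ten doses shown in the question; the commented-out DESeq2 call reuses the object names from the original post and is not run here.

```r
# Hypothetical stand-in for the sample table (ten doses from the question)
coldata <- data.frame(Dose = c(0.1, 0, 0.03, 0.3, 0, 0.1, 0.1, 0.1, 0.03, 0.3))
coldata$DoseFactor <- factor(coldata$Dose,
                             levels = sort(unique(coldata$Dose)),
                             ordered = TRUE)

# With R's default contrast options, model.matrix() encodes an ordered
# factor with polynomial contrasts: linear (.L), quadratic (.Q), cubic (.C)
mm <- model.matrix(~ DoseFactor, data = coldata)
colnames(mm)
dim(mm)

# Assumed usage (needs DESeq2 and the real data, with the matrix built
# from the real sample table rather than the stand-in above):
# mm <- model.matrix(~ DoseFactor, data = samplesubset.1061)
# dds <- DESeqDataSetFromMatrix(countData = countMatrix.1061,
#                               colData = samplesubset.1061,
#                               design = mm)
```

With a design matrix of this form, the coefficients correspond to polynomial trends across the dose levels rather than to pairwise comparisons between doses.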

I need to figure out why the correct error message isn't being triggered (there is one, for ordered) and fix that in devel.

Figured out what was wrong with my code, and fixed this in devel branch so a more useful error message will be printed.

My code was assuming that I would get back a single character string from calling class() on a design variable, but this isn't what happens:

> class(ordered(1:3))
[1] "ordered" "factor"
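
As a general illustration of this pitfall (a sketch in base R, not the actual DESeq2 code): comparing class() against a single string is fragile for objects with more than one class, whereas is.ordered() and inherits() always return a single TRUE or FALSE.

```r
x <- ordered(1:3)

# class() returns one string per class, so this has length 2
cls <- class(x)
length(cls)

# An element-wise comparison therefore yields a vector, not a single value
cls == "ordered"

# Checks that return exactly one logical regardless of how many classes x has
is.ordered(x)
inherits(x, "ordered")
is.factor(x)
```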

Excellent, thanks for taking a look! Though, on further thought, I don't think an ordered factor really gives us any dose-response information beyond what an unordered factor would. Probably best to stick with the continuous variable.


