Problem making DESeq dataset; Error all variables in design formula must be columns in colData
@wannesnauwynck-10609

Hi,

I just started a differential expression analysis using DESeq2 with a count dataset. There are two cell lines in which I want to detect DE: an irradiated one and a control one. I have used DESeq2 and DESeqDataSetFromMatrix before and they have always worked fine for me, but now, for some reason, DESeqDataSetFromMatrix stops with an error message. These are my inputs:

head(counts)

         X01 X02 X03 X04
A1BG     241  48 225 129
A1BG-AS1  46  14  34  45
A1CF      28   5  18  28
A2M        2   0   1   0
A2M-AS1    0   0   0   0
A2ML1     11   1   1   4

head(mycols)    #coldata

condition = as.factor(c(rep("Ctr",2),rep("Irr",2)))
mycols = data.frame(row.names = c("X01","X02","X03","X04"),condition)
mycols
    condition
X01       Ctr
X02       Ctr
X03       Irr
X04       Irr

>dsd = DESeqDataSetFromMatrix(countData = counts,colData = mycols, design ~ condition) 

Error in DESeqDataSet(se, design = design, ignoreRank) :
  all variables in design formula must be columns in colData

I don't understand this error at all. As far as I know, all variables in the design formula (condition here) ARE included as a column in the colData!

I've always done it this way and it's the first time I received this error so I don't know what to do here.

Any help would be greatly appreciated, thanks!!

deseq2 deseq counts rnaseq differential gene expression

Hello!

Is it possible that you forgot the "=" in 

DESeqDataSetFromMatrix(countData = counts, colData = mycols, design = ~ condition)

?

haha, yes that did the trick, God I'm dumb! Thanks so much for your comment!

I'm not sure how, but the same "countData = " portion was also deleted from my previously working script. This solved it immediately, thank you for posting and sharing!

@mikelove
United States

Already answered by Radek in comment.

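While this is not the most helpful error message, the reason the error was thrown is that, without "design =", the whole formula design ~ condition is passed as the design, so you effectively wrote design = design ~ condition. "design" then counts as a variable in the formula, and it is obviously not a column in colData.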
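
To see this concretely, here is a minimal base-R sketch (it does not need DESeq2) that compares the variables R extracts from the two formulas. `all.vars` is a base R function; the `%in%` comparison against `colnames(mycols)` only approximates the kind of check that raises the error and is not DESeq2's actual code. The names `wrong` and `right` are introduced here for illustration.

```r
# colData from the original question
condition <- as.factor(c(rep("Ctr", 2), rep("Irr", 2)))
mycols <- data.frame(row.names = c("X01", "X02", "X03", "X04"), condition)

# Without "design =", the third argument receives a two-sided formula
wrong <- design ~ condition
right <- ~ condition

all.vars(wrong)   # "design" "condition"
all.vars(right)   # "condition"

# Approximation of the check: every formula variable must be a colData column
all(all.vars(wrong) %in% colnames(mycols))   # FALSE
all(all.vars(right) %in% colnames(mycols))   # TRUE
```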

Hello, 

I got the same error "Error in DESeqDataSet(se, design = design, ignoreRank) : 
  all variables in design formula must be columns in colData"

my code is:

fulldds <- DESeqDataSetFromMatrix(countData = cts, colData = mat2, design = ~ cnd)

I am trying to estimate the group effect using a nested model, following the tutorial at https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#differential-expression-analysis.

Could you give me some advice?

cheers, 

Luca

Let's parse the error message. Your design formula is ~ cnd, which has one variable, "cnd". The error says "all variables in design formula must be columns in colData", i.e. the one variable in the design formula, "cnd", needs to be a column in colData, which is mat2 here. It has to be the exact name: R and DESeq2 can't guess which column you mean by "cnd" unless the name matches exactly. Is "cnd" a column in colData (mat2)?
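
One quick way to answer that question is to compare the formula's variables with the column names directly. This is only a base-R sketch: the small `mat2` below is a hypothetical stand-in whose column happens to be called "condition", and `missing_vars` is a name introduced here for illustration.

```r
# Hypothetical sample table; the column is named "condition", not "cnd"
mat2 <- data.frame(
  row.names = paste0("s", 1:4),
  condition = factor(c("a", "b", "a", "b"))
)

design <- ~ cnd

# Variables used in the formula that are not column names of mat2
missing_vars <- setdiff(all.vars(design), colnames(mat2))
missing_vars   # "cnd": no column has exactly this name
```

An empty result would mean that every variable in the formula matches a column name exactly.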

Thanks a lot for the quick reply.

colnames(mat2)

"(Intercept)"         "grpSeili"            "grpRauma:ind.nb"     "grpSeili:ind.nb"   "grpRauma:ind.nc"     "grpSeili:ind.nc"     "grpRauma:ind.nd"     "grpRauma:cndpresent"  "grpSeili:cndpresent"

When I replace cnd with one of the column names of mat2, I get the same error:

fulldds <- DESeqDataSetFromMatrix(countData = cts, colData = mat2, design = ~ grpSeili:cndpresent)

There's an issue here. It looks like mat2 here is the output of model.matrix. It's best if you put the original variables into colData, that is "grp" and "ind". You'll have lots of problems unless colData contains the actual variables.

I see. 

I found from the link I previously provided that nested designs need some extra coding. 

In that guide, this is suggested:

model.matrix(~ grp + grp:ind.n + grp:cnd, coldata), so that nested effects among groups could be taken into account. From this point, how do I get to DE analysis if I can't use the model.matrix just created?

 

You supply the model matrix directly to the "full" argument:

dds <- DESeq(dds, full=full)
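
As a rough sketch of how the pieces fit together, the base-R code below builds the nested model matrix from the original variables and drops any all-zero columns before it is handed to DESeq. The sample table is hypothetical: the group names and the "present" level are taken from the column names posted above, while the "absent" level, the individual labels, and the 8 + 6 sample layout are assumptions made for illustration. The final DESeq call is left as a comment and assumes `dds` is an existing DESeqDataSet whose colData holds these variables.

```r
# Hypothetical sample table with the ORIGINAL variables (not a model matrix)
coldata <- data.frame(
  grp   = factor(rep(c("Rauma", "Seili"), times = c(8, 6))),
  ind.n = factor(c(rep(c("a", "b", "c", "d"), each = 2),
                   rep(c("a", "b", "c"), each = 2))),
  cnd   = factor(rep(c("absent", "present"), times = 7))
)

# Nested design: individuals within group, plus a group-specific condition effect
m1 <- model.matrix(~ grp + grp:ind.n + grp:cnd, coldata)

# Unbalanced groups leave columns that are all zero; remove them
all.zero <- apply(m1, 2, function(x) all(x == 0))
m1 <- m1[, !all.zero]

colnames(m1)

# Then supply the matrix to the "full" argument, as in the reply above:
# dds <- DESeq(dds, full = m1)
```

With this layout the only dropped column is "grpSeili:ind.nd", because the Seili group has no individual "d".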

ok, it works!

thanks a lot for your help and patience!

have a good day

-Luca

Also, I had to run "colnames(cts) <- NULL" to avoid an error, as I found out in another post, so the colnames of my countData are now NULL. Not sure whether this is relevant...

                       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
TRINITY_DN293388_c0_g1    0    7    0    1    0    0    0    3    0     0     2     0     0     1
TRINITY_DN206683_c1_g2    0    0  102    0    0    0    0    0    0   442     0     0     0     0
TRINITY_DN217091_c0_g3  123  617  161  123  106  209  102  564  131   287   483   218   473   620
TRINITY_DN216710_c4_g1  106  141   92  120  156  122  211  135  116   127   131    50   166   119
TRINITY_DN269500_c0_g1    0    3    0    0    0    0    0    0    0     0     0     0     0     6
TRINITY_DN219001_c0_g2   82  416  235  102   87   77   90  414  120   276   459    79   369   328

 

 
