I am trying to test the effect of different conditions at different time points, but after going through the posts and vignettes I am still confused.
The call below is where I get the error that the model matrix is not full rank. From the posts I understood that I have to create another column with nested values, but I don't understand which column it should be based on. Is it time?
dds <- DESeqDataSetFromMatrix(htseq_data, sampleTable, design = ~ condition + time + condition:time )
And this is my sampleTable:
LIMS Client condition time
SAMPLE-IN-POOL19-0050-R0001.counts BU_243 PH 1
SAMPLE-IN-POOL19-0051-R0001.counts BU_244 PH 1
SAMPLE-IN-POOL19-0052-R0001.counts BU_245 PH 1
SAMPLE-IN-POOL19-0031-R0001.counts BU_247 TCP 1
SAMPLE-IN-POOL19-0032-R0001.counts BU_248 TCP 1
RNA20-0041-R0001.counts BU_249 TCP 1
SAMPLE-IN-POOL19-0039-R0001.counts BU_258 CO 0
SAMPLE-IN-POOL19-0040-R0001.counts BU_259 CO 0
SAMPLE-IN-POOL19-0041-R0001.counts BU_260 CO 0
SAMPLE-IN-POOL19-0042-R0001.counts BU_261 PH 6
SAMPLE-IN-POOL19-0043-R0001.counts BU_263 PH 6
SAMPLE-IN-POOL19-0044-R0001.counts BU_264 PH 6
SAMPLE-IN-POOL19-0045-R0001.counts BU_265 TCP 6
SAMPLE-IN-POOL19-0046-R0001.counts BU_266 TCP 6
RNA20-0043-R0001.counts BU_268_bis TCP 6
SAMPLE-IN-POOL19-0035-R0001.counts BU_252 PH 3
SAMPLE-IN-POOL19-0033-R0001.counts BU_250 PH 3
SAMPLE-IN-POOL19-0034-R0001.counts BU_251 PH 3
SAMPLE-IN-POOL19-0036-R0001.counts BU_253 TCP 3
SAMPLE-IN-POOL19-0037-R0001.counts BU_255 TCP 3
RNA20-0042-R0001.counts BU_254 TCP 3
Thank you
CO is nested within time point 0: CO only occurs at time 0, and time 0 contains only CO samples. For the given design you would therefore need to remove CO from the analysis.
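The nesting can be checked directly in base R before calling DESeq2. The sketch below rebuilds only the condition and time columns from the sampleTable in the question (the LIMS and Client columns are left out, and treating time as a factor is an assumption) and compares the number of coefficients the formula asks for with the rank of the resulting design matrix.

```r
# Sample layout copied from the sampleTable in the question (21 samples).
sampleTable <- data.frame(
  condition = factor(c(rep("PH", 3), rep("TCP", 3), rep("CO", 3),
                       rep("PH", 3), rep("TCP", 3),
                       rep("PH", 3), rep("TCP", 3)),
                     levels = c("CO", "PH", "TCP")),
  time = factor(c(rep(1, 6), rep(0, 3), rep(6, 6), rep(3, 6)),
                levels = c(0, 1, 3, 6))
)

# Cross-tabulation: CO only occurs at time 0, and time 0 only contains CO.
print(table(sampleTable$condition, sampleTable$time))

# Design matrix for the formula used in the question.
mm <- model.matrix(~ condition + time + condition:time, data = sampleTable)

n_columns <- ncol(mm)       # coefficients the formula asks for
matrix_rank <- qr(mm)$rank  # coefficients the data can actually identify

# Fewer identifiable coefficients than columns means "not full rank".
print(c(columns = n_columns, rank = matrix_rank))
```

Only seven of the twelve condition/time cells contain samples, so the rank cannot exceed seven, while the formula requests twelve coefficients.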
But is there any way around this? I want to see the difference in gene expression at every time point of each condition compared to the control. Thank you.
Then you probably (I guess) need a full-factorial design, that is, combine the two columns into one, like condition_time, e.g. PH_1, TCP_1, CO_0 and so on. Let's call this column factorial, use ~ factorial as the design, and then make contrasts that meaningfully describe your experiment.
Thank you. I already did that. But I am not sure it gives the real picture (gene expression in a time-dependent manner).
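For reference, the combined-column design suggested above could look roughly like this. It is a sketch, not a definitive analysis: coldata is a stand-in name for the sample table, the three contrasts are only examples, and the DESeq2 step is skipped unless the package and the htseq_data count matrix from the question are present in the session.

```r
# Condition/time layout copied from the sampleTable in the question.
# `coldata` is a stand-in name; in practice, add the new column to the
# existing sampleTable, keeping its row order matched to the count columns.
coldata <- data.frame(
  condition = c(rep("PH", 3), rep("TCP", 3), rep("CO", 3),
                rep("PH", 3), rep("TCP", 3),
                rep("PH", 3), rep("TCP", 3)),
  time = c(rep(1, 6), rep(0, 3), rep(6, 6), rep(3, 6))
)

# One level per condition/time cell, with the control as reference level.
coldata$factorial <- relevel(
  factor(paste(coldata$condition, coldata$time, sep = "_")),
  ref = "CO_0"
)

# Check that the one-factor design is full rank.
mm <- model.matrix(~ factorial, data = coldata)
n_groups <- nlevels(coldata$factorial)
matrix_rank <- qr(mm)$rank

# Runs only if DESeq2 and the count matrix from the question are available.
if (requireNamespace("DESeq2", quietly = TRUE) && exists("htseq_data")) {
  dds <- DESeq2::DESeqDataSetFromMatrix(htseq_data, coldata,
                                        design = ~ factorial)
  dds <- DESeq2::DESeq(dds)

  # Treated groups against the control (example contrasts).
  res_PH_1  <- DESeq2::results(dds, contrast = c("factorial", "PH_1", "CO_0"))
  res_TCP_6 <- DESeq2::results(dds, contrast = c("factorial", "TCP_6", "CO_0"))

  # Conditions against each other within one time point.
  res_TCP_vs_PH_3 <- DESeq2::results(dds,
                                     contrast = c("factorial", "TCP_3", "PH_3"))
}
```

Each contrast is a pairwise comparison between two groups, so a time-dependent picture has to be assembled from several of them, for example PH_1, PH_3 and PH_6 each against CO_0.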
Another idea I had is to run two separate analyses using this design: 1) a time-course analysis with TCP and CO, and 2) the same analysis with PH and CO.
Of course, the p-values will differ from what I would get with all the samples included. But I think it works.
What do you think?
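One thing that may be worth checking before splitting the data as proposed above: within each subset, the control is still the only group at time 0. The base R sketch below (coldata and tcp_sub are stand-in names, and time is treated as a factor, which is an assumption) builds the TCP plus CO subset and compares two candidate designs.

```r
# Condition/time layout copied from the sampleTable in the question.
coldata <- data.frame(
  condition = c(rep("PH", 3), rep("TCP", 3), rep("CO", 3),
                rep("PH", 3), rep("TCP", 3),
                rep("PH", 3), rep("TCP", 3)),
  time = factor(c(rep(1, 6), rep(0, 3), rep(6, 6), rep(3, 6)),
                levels = c(0, 1, 3, 6))
)

# Keep TCP plus the controls; the PH subset would be built the same way.
tcp_sub <- droplevels(coldata[coldata$condition %in% c("TCP", "CO"), ])

# Every time-0 sample is CO and every later sample is TCP.
print(table(tcp_sub$condition, tcp_sub$time))

# A design with both terms still has more columns than its rank ...
mm_both <- model.matrix(~ condition + time, data = tcp_sub)
ncol_with_condition <- ncol(mm_both)
rank_with_condition <- qr(mm_both)$rank

# ... while time alone already separates the four groups.
mm_time <- model.matrix(~ time, data = tcp_sub)
ncol_time_only <- ncol(mm_time)
rank_time_only <- qr(mm_time)$rank
```

In this subset, a design of ~ time describes the same four groups as the combined column would, so the main practical difference from the all-samples analysis is that fewer samples contribute to the dispersion estimates.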
Do you mean using the full design and running a likelihood ratio test (LRT)?
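If an LRT is what is meant, one possible reading is to test the combined factor against an intercept-only model, which asks whether a gene differs between any of the seven groups. The thread does not settle this, so the choice of reduced design here is an assumption, as is the name coldata. The DESeq2 step runs only if the package and the htseq_data matrix from the question are available.

```r
# Condition/time layout copied from the sampleTable in the question.
coldata <- data.frame(
  condition = c(rep("PH", 3), rep("TCP", 3), rep("CO", 3),
                rep("PH", 3), rep("TCP", 3),
                rep("PH", 3), rep("TCP", 3)),
  time = c(rep(1, 6), rep(0, 3), rep(6, 6), rep(3, 6))
)
coldata$factorial <- relevel(
  factor(paste(coldata$condition, coldata$time, sep = "_")),
  ref = "CO_0"
)

full_design    <- ~ factorial  # one coefficient per group
reduced_design <- ~ 1          # assumption: no group effect at all

# Number of coefficients removed by the reduced model.
df_tested <- ncol(model.matrix(full_design, data = coldata)) -
  ncol(model.matrix(reduced_design, data = coldata))

# Runs only if DESeq2 and the count matrix from the question are available.
if (requireNamespace("DESeq2", quietly = TRUE) && exists("htseq_data")) {
  dds <- DESeq2::DESeqDataSetFromMatrix(htseq_data, coldata,
                                        design = full_design)
  dds <- DESeq2::DESeq(dds, test = "LRT", reduced = reduced_design)
  res_lrt <- DESeq2::results(dds)
}
```

With test = "LRT", the p-values in the results table refer to the comparison of the full and reduced models, while the log2 fold change column still describes a single pairwise comparison, so the two should not be read as belonging together.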