Hi
I have a multifactorial experiment with three factors: two levels of light treatment( high and low), two levels for tissues (cotyledon and hypocotyl), and finally I have a time course with four levels (15, 45, 90, 180 including the time 0). We have 57 libraries with replicates.
> coldata
tissue light time
000_A_C1 Cotyledon High_FR 0
000_A_C2 Cotyledon High_FR 0
000_A_H1 Hipocotyl High_FR 0
000_A_H2 Hipocotyl High_FR 0
000_A_H3 Hipocotyl High_FR 0
015_A_C1 Cotyledon High_FR 15
015_A_C2 Cotyledon High_FR 15
015_A_H1 Hipocotyl High_FR 15
015_A_H2 Hipocotyl High_FR 15
015_A_H3 Hipocotyl High_FR 15
015_A_H4 Hipocotyl High_FR 15
015_B_C1 Cotyledon Low_FR 15
015_B_C2 Cotyledon Low_FR 15
015_B_H1 Hipocotyl Low_FR 15
015_B_H2 Hipocotyl Low_FR 15
045_A_C1 Cotyledon High_FR 45
045_A_C2 Cotyledon High_FR 45
045_A_C3 Cotyledon High_FR 45
045_A_H1 Hipocotyl High_FR 45
045_A_H2 Hipocotyl High_FR 45
045_A_H3 Hipocotyl High_FR 45
045_A_H4 Hipocotyl High_FR 45
045_B_C1 Cotyledon Low_FR 45
045_B_C2 Cotyledon Low_FR 45
045_B_C3 Cotyledon Low_FR 45
045_B_H1 Hipocotyl Low_FR 45
045_B_H2 Hipocotyl Low_FR 45
045_B_H3 Hipocotyl Low_FR 45
045_B_H4 Hipocotyl Low_FR 45
090_A_C1 Cotyledon High_FR 90
090_A_C2 Cotyledon High_FR 90
090_A_C3 Cotyledon High_FR 90
090_A_C4 Cotyledon High_FR 90
090_A_H1 Hipocotyl High_FR 90
090_A_H2 Hipocotyl High_FR 90
090_A_H3 Hipocotyl High_FR 90
090_A_H4 Hipocotyl High_FR 90
090_B_C1 Cotyledon Low_FR 90
090_B_C2 Cotyledon Low_FR 90
090_B_C3 Cotyledon Low_FR 90
090_B_H1 Hipocotyl Low_FR 90
090_B_H2 Hipocotyl Low_FR 90
090_B_H3 Hipocotyl Low_FR 90
090_B_H4 Hipocotyl Low_FR 90
180_A_C1 Cotyledon High_FR 180
180_A_C2 Cotyledon High_FR 180
180_A_C3 Cotyledon High_FR 180
180_A_H1 Hipocotyl High_FR 180
180_A_H2 Hipocotyl High_FR 180
180_A_H3 Hipocotyl High_FR 180
180_B_C1 Cotyledon Low_FR 180
180_B_C2 Cotyledon Low_FR 180
180_B_C3 Cotyledon Low_FR 180
180_B_C4 Cotyledon Low_FR 180
180_B_H1 Hipocotyl Low_FR 180
180_B_H2 Hipocotyl Low_FR 180
180_B_H3 Hipocotyl Low_FR 180
To start analyzing the data, I designed the following model
coldata
> dds <- DESeqDataSetFromMatrix(
countData = countData,
colData = coldata,
design = ~ light + time + tissue + light:tissue)
> ds
Pre-filtering runs collapsed
dds <- ddsCollapsed[ rowSums(counts(ddsCollapsed)) > 50, ]
dds
I have been able to perform a contrast of the light treatment
##Deseq2
> dds <- DESeq(dds)
> resultsNames(dds)
#Contrast high light vs low light
> res_llow_vs_lhigh <- results(dds,contrast=c("light","Low_FR","High_FR"))
> res_llow_vs_lhigh <- res_llow_vs_lhigh[order(res_llow_vs_lhigh$padj),] # Sorting
> head (res_llow_vs_lhigh) log2 fold change (MLE): light Low_FR vs High_FR Wald test p-value: light Low FR vs High FR DataFrame with 6 rows and 6 columns baseMean log2FoldChange lfcSE stat pvalue padj <numeric> <numeric> <numeric> <numeric> <numeric> <numeric> AT1G02340 2083.4933 3.361691 0.1856905 18.10373 2.978046e-73 6.302439e-69 AT4G16780 2473.4689 4.248042 0.2401146 17.69173 4.856344e-70 5.138741e-66 AT1G04180 134.6161 6.273875 0.4672011 13.42864 4.108919e-41 2.898569e-37 AT3G21330 821.5631 5.059732 0.4170330 12.13269 7.088210e-34 3.750195e-30 AT1G21050 994.0203 1.477917 0.1245895 11.86229 1.858119e-32 6.593775e-29 AT2G46970 679.6482 6.868255 0.5790236 11.86179 1.869425e-32 6.593775e-29 |
But now Im interested to view the genes that are differentially expressed in the time course comparing between tissues, but when I modify the model design with the time interactions it throws me an error : "Model is not full rank".
> dds <- DESeqDataSetFromMatrix( + countData = countData, + colData = coldata, + design = ~ light + time + tissue + light:tissue + light:time) Error in checkFullRank(modelMatrix) : the model matrix is not full rank, so the model cannot be fit as specified. One or more variables or interaction terms in the design formula are linear combinations of the others and must be removed. Please read the vignette section 'Model matrix not full rank': vignette('DESeq2')
I read the vignette and I think that the possible reason is the first: one or more columns in my model matrix are linear combinations of other columns. Could you help me fix my data set to turn it into a model matrix full rank?
Another question is How can I analyze a time series experiment? I read the vignette from "airway", but I have doubts about how they set up the dataset in the package?
Regards
Hi Michael, thanks for your answer. In this case, I want to compare the time 0 vs all the times (0 vs 15, 0 vs 45 ..... etc), to observe through time that genes change in contrast to time 0, both in cotyledons and in hypocotyls while you apply the light treatment. But I think that I have to consider the interactions.
What do you recommend me?
Regards
It’s a relatively complex design so I’d recommend meeting with a statistician so you get the design and contrasts that you want. There is nothing special about the way DESeq2 will generate coefficients from standard R formula for linear models.