I performing a Time-series experiment with several patients with 2 conditions and 2 time-points from each patient. The patients associated to the first condition are different respect to the second one. The experiment consist of 24 patients, 12 of them who respond to one treatment and the other 12 who not respond to the same treatment. the RNA raw count data was performed for two times in each patient (time=0 and time=2). I would like to compare the differential expression between responders and non-responders taking into account both time-series. However, i cannot create the matrix in the step "DESeqDataSetFromMatrix" when I'm introducing the model matrix (like this ~ phenotypeData + phenotypeData:patient.n + phenotypeData:time, coldata))with the conditions. My code is shown below:
The experiment design coldata
phenotypeData patient time patient.n
1 responders 1 t0 1
2 responders 1 t2 1
3 responders 2 t0 2
4 responders 2 t2 2
5 responders 3 t0 3
6 responders 3 t2 3
7 responders 4 t0 4
8 responders 4 t2 4
.
.
.
43 nonresponders 22 t0 22
44 nonresponders 22 t2 22
45 nonresponders 23 t0 23
46 nonresponders 23 t2 23
47 nonresponders 24 t0 24
48 nonresponders 24 t2 24
and the CountData is like this:
X1_SAL008_0_R X2_SAL008_2_R X5_JC022_0_R X6_JC022_2_R X9_NJ029_0_R X10_NJ029_2_R X13_IL008_0_R
ENSG00000000003 3.05 20.01 24.38 8.47 6.11 7.28 6.00
ENSG00000000005 0.00 0.00 0.00 0.00 0.00 0.00 0.00
ENSG00000000419 222.00 122.00 165.00 124.00 125.00 152.00 45.00
ENSG00000000457 160.33 342.62 314.02 295.18 228.01 244.49 86.41
And so on....
And my DESeq2 code:
ListRNAseqTimeSeries<-"raw_data_matrix_allSamples_RvsNR_newDESeqFINAL2.csv"
ListRNAseqTimeSeries
coldata <- DataFrame(phenotypeData=factor(c(rep("responders",each=24),rep("nonresponders",each=24))),
patient=factor(c(rep(1:24,each=2))),
time=factor(rep(c("t0","t2"),each=1)))
coldata$patient.n <- factor(c(rep(1:24,each=2)))
coldata$phenotypeData <- relevel(coldata$phenotypeData, "nonresponders")
RNA_rawMatrix <- data.matrix(read.csv(ListRNAseqTimeSeries, header = T, sep = "\t", dec = ".", row.names = 1))
RNA_rawMatrix2DESeq <- floor(RNA_rawMatrix) ## DESeq only allows integer values
RNAMatrix <- model.matrix(~ phenotypeData + phenotypeData:patient.n + phenotypeData:time, coldata, tidy = F)
RNAMatrix2 <- model.matrix(~ phenotypeData + phenotypeData:patient.n, coldata, tidy = F)
idx <- which(apply(RNAMatrix, 2, function(x) all(x==0)))
idx2 <- which(apply(RNAMatrix2, 2, function(x) all(x==0)))
RNAMatrix <- RNAMatrix[,-idx]
RNAMatrix2 <- RNAMatrix2[,-idx]
RNA_Data <- DESeqDataSetFromMatrix(countData = RNA_rawMatrix2DESeq, colData = coldata, design = RNAMatrix)
RNA_DE <- DESeq (RNA_Data, test = "LRT", reduced = RNAMatrix2)
Log2FC <- results(RNA_DE)
RNA_DE <- estimateSizeFactors(RNA_DE)
NormMatrix<-counts(RNA_DE, normalized=TRUE)
write.table (Log2FC,file = paste("RNA_DE",substr(ListRNAseqTimeSeries,27,70), sep=""),sep ="\t",row.names = TRUE,col.names = TRUE,quote=F)
write.table (NormMatrix,file = paste("RNA_NormMatrix",substr(ListRNAseqTimeSeries,27,40), sep=""),sep ="\t",row.names = TRUE,col.names = TRUE,quote=F)r
But i have the following error:
Error in checkFullRank(modelMatrix) : the model matrix is not full rank, so the model cannot be fit as specified. One or more variables or interaction terms in the design formula are linear combinations of the others and must be removed.
And i don't know exactly how solve this error. Thanks so much in advance for further information in order to solve this problem